SCARECROW GENE, PROMOTER AND USES THEREOF

This application is a continuation-in-part of
co-pending Application No. 08/638,617, filed April 26, 1996,
the disclosure of which is incorporated by reference in its
entirety.

This invention was made with government support
under grant number GM43778 awarded by the National
Institutes of Health. The government may have certain rights
in the invention.

1. INTRODUCTION
The present invention generally relates to the
SCARECROW (SCR) gene family and their promoters. The
invention more particularly relates to ectopic expression of
members of the SCARECROW gene family in transgenic plants to
artificially modify plant structures. The invention also
relates to utilization of the SCARECROW promoter for tissue and
organ specific expression of heterologous gene products.

2. BACKGROUND OF THE INVENTION
Asymmetric cell divisions, in which a cell divides
to give two daughters with different fates, play an important
role in the development of all multicellular organisms. In
plants, because there is no cell migration, the regulation of
asymmetric cell divisions is of heightened importance in
determining organ morphology. In contrast to animal
embryogenesis, most plant organs are not formed during
embryogenesis. Rather, cells that form the apical meristems
are set aside at the shoot and root poles. These reservoirs
of stem cells are considered to be the source of all post-
embryonic organ development in plants. A fundamental
question in developmental biology is how meristems function
to generate plant organs.

2.1. ROOT DEVELOPMENT

Root organization is established during
embryogenesis. This organization is propagated during
postembryonic development by the root meristem. Following
germination, the development of the postembryonic root is a
continuous process; a series of initials or stem cells
continuously divide to perpetuate the pattern established in
the embryonic root (Steeves & Sussex, 1972, Patterns in Plant
Development, Englewood Cliffs, NJ: Prentice-Hall, Inc.).
Due to the organization of the Arabidopsis root it
is possible to follow the fate of cells from the meristem to
maturity and identify the progenitors of each cell type
(Dolan et al., 1993, Development 119:71-84). The Arabidopsis
root is a relatively simple and well characterized organ.
The radial organization of the mature tissues in the
Arabidopsis root has been likened to tree rings with the
epidermis, cortex, endodermis and pericycle forming radially
symmetric cell layers that surround the vascular cylinder
(FIG. 1A). See also Dolan et al., 1993, Development
119:71-84. These mature tissues are derived from four sets
of stem cells or initials: i) the columella root cap initial;
ii) the pericycle/vascular initial; iii) the
epidermal/lateral root cap initial; and iv) the
cortex/endodermal initial (Dolan et al., 1993, Development
119:71-84). It has been shown that these initials undergo
asymmetric divisions (Scheres et al., 1995, Development
121:53-62). The cortex/endodermal initial, for example,
first divides anticlinally (in a transverse orientation)
(FIG. 1B). This asymmetric division produces another initial
and a daughter cell. The daughter cell, in turn, expands and
then divides periclinally (in the longitudinal orientation)
(FIG. 1B). This second asymmetric division produces the
progenitors of the endodermis and the cortex cell lineages
(FIG. 1B).

2.2. GENES REGULATING ROOT STRUCTURE
Mutations that disrupt the asymmetric divisions of
the cortex/endodermal initial have been identified and
characterized (Benfey et al., 1993, Development 119:57-70;
Scheres et al., 1995, Development 121:53-62). short-root
(shr) and scarecrow (scr) mutants are missing a cell layer
between the epidermis and the pericycle. In both types of
mutants the cortex/endodermal initial divides anticlinally,
but the subsequent periclinal division that increases the
number of cell layers does not take place (Benfey et al.,
1993, Development 119:57-70; Scheres et al., 1995,
Development 121:53-62). The defect is first apparent in the
embryo and it extends throughout the entire embryonic axis
which includes the embryonic root and hypocotyl (Scheres et
al., 1995, Development 121:53-62). This is also true for the
other radial organization mutants characterized to date,
suggesting that radial patterning that occurs during
embryonic development may influence the post-embryonic
pattern generated by the meristematic initials (Scheres et
al., 1995, Development 121:53-62).

Characterization of the mutant cell layer in shr
indicated that two endodermal-specific markers were absent
(Benfey et al., 1993, Development 119:57-70). This provided
evidence that the wild-type SHR gene may be involved in
specification of endodermis identity.

2.3. GEOTROPISM

In plants, the capacity for gravitropism has been
correlated with the presence of amyloplast sedimentation.
See, e.g., Volkmann and Sievers, 1979, Encyclopedia Plant
Physiol., N.S. vol 7, pp. 573-600; Sack, 1991, Intern. Rev.
Cytol. 127:193-252; Björkmann, 1992, Adv. Space Res. 12:195-
201; Poff et al., in The Physiology of Tropisms, Meyerowitz &
Somerville (eds); Cold Spring Harbor Laboratory Press,
Plainview, NY (1994) pp. 639-664; Barlow, 1995, Plant Cell
Environ. 18:951-962. Amyloplast sedimentation only occurs in
cells in specific locations at distinct developmental stages.
That is, when and where sedimentation occurs is precisely
regulated (Sack, 1991, Intern. Rev. Cytol. 127:193-252). In
roots, amyloplast sedimentation only occurs in the central
(columella) cells of the rootcap; as these cells mature into
peripheral cap cells, the amyloplasts no longer sediment
(Sack & Kiss, 1989, Amer. J. Bot. 76:454-464; Sievers &
Braun, in The Root Cap: Structure and Function, Wassail et
al. (eds), New York: M. Dekker (1996) pp. 31-49). In stems
of many plants, including Arabidopsis, amyloplast
sedimentation occurs in the starch sheath (endodermis)
especially in elongating regions of the stem (von Guttenberg,
Die Physiologischen Scheiden, Handbuch der Pflanzenanatomie;
K. Linsbauer (ed.), Berlin: Gebrüder Borntraeger, vol. 5
(1943) p. 217; Sack, 1987, Can. J. Bot. 65:1514-1519; Sack,
1991, Intern. Rev. Cytol. 127:193-252; Caspar & Pickard,
1989, Planta 177:185-197; Volkmann et al., 1993, J. Pl.
Physiol. 142:710-6).

Gravitropic mutants have been studied for evidence
that proves the role of amyloplast sedimentation in gravity
sensing. However, many gravitropic mutations affect
downstream events such as auxin sensitivity or metabolism
(Masson, 1995, BioEssays 17:119-127). Other mutations seem
to affect gene products that process information from gravity
sensing. For example, the lazy mutants of higher plants and
comparable mutants in mosses can clearly sense and respond to
gravity, but the mutations reverse the normal polarity of the
gravitropic response (Gaiser & Lomax, 1993, Plant Physiol.
102:339-344; Jenkins et al., 1986, Plant Cell Environ. 9:637-
644). Other mutations appear to affect gravitropism of
specific organs. For example, sgr mutants have defective
shoot gravitropism (Fukaki et al., 1996, Plant Physiol.
110:933-943; Fukaki et al., 1996, Plant Physiol. 110:945-955;
Fukaki et al., 1996, Plant Res. 109:129-137).

Citation or identification of any reference herein
shall not be construed as an admission that such reference is
available as prior art to the present invention.

3. SUMMARY OF THE INVENTION

The structure and function of a regulatory gene,
SCARECROW (SCR), is described. The SCR gene is expressed
specifically in root progenitor tissues of embryos, and in
certain tissues of roots and stems. SCR expression controls
cell division of certain cell types in roots, and affects the
organization of root and stem. The invention relates to the
SCARECROW (SCR) gene (which encompasses the Arabidopsis SCR
gene and its orthologs and paralogs), SCR gene products
(including but not limited to transcriptional products such
as mRNAs, antisense and ribozyme molecules, and translational
products such as the SCR protein, polypeptides, peptides and
fusion proteins related thereto), antibodies to SCR gene
products, SCR regulatory regions and the use of the foregoing
to improve agronomically valuable plants.

The invention is based, in part, on the discovery,
identification and cloning of the gene responsible for the
scarecrow phenotype. In contrast to the prevailing view that
the SCR gene was likely to be involved in the specification
of endodermis, the inventors have determined that the mutant
cell layer in roots of scr mutants has differentiated
characteristics of both cortex and endodermis. This is
consistent with a role for SCR in the regulation of the
asymmetric cell division rather than in specification of the
identity of either cortex or endodermis. The inventors have
also determined that SCR expression affects the gravitropism
of plant aerial structures such as the stem.

One aspect of the invention relates to the
heterologous expression of SCR genes and related nucleotide
sequences, and specifically the Arabidopsis SCR genes, in
stably transformed higher plant species. Modulation of SCR
expression levels can be used to advantageously modify root
and aerial structures of transgenic plants and enhance the
agronomic properties of such plants.
Another aspect of the invention relates to the use
of promoters of SCR genes, and specifically the use of
Arabidopsis SCR promoter to control the expression of protein
and RNA products in plants. Plant SCR promoters have a
variety of uses, including but not limited to expressing
heterologous genes in the embryo, root, root nodule, and stem
of transformed plants.

WO 97/41152                                    PCT/US97/07022

The invention is illustrated by working examples
described infra which demonstrate the isolation of the
Arabidopsis SCR gene using insertion mutagenesis. More
specifically, T-DNA tagging of genomic and cDNA clones of the
Arabidopsis SCR gene are described. Additional working
examples include the isolation of SCR sequences from plant
genomes using PCR amplification in combination with screening
of genomic libraries, and heterologous gene expression in
transgenic plants using SCR promoter expression constructs.

Structural analysis of the deduced amino acid
sequence of Arabidopsis SCR protein indicates that SCR
encodes a transcription factor. Northern analysis, in situ
hybridization analysis and enhancer trap analysis show highly
localized expression of Arabidopsis SCR in embryos and roots.
Genetic analysis shows SCR expression also affects
gravitropism of aerial structures (e.g., stems). This
indicates that SCR is also expressed in those structures.

Computer comparison of the deduced amino acid
sequence of Arabidopsis SCR protein with those of Expressed
Sequence Tag (EST) sequences in GenBank reveals the existence
of at least thirteen SCR genes in Arabidopsis, one SCR gene
in maize, four SCR genes in rice, and one SCR gene in
Brassica. A further aspect of the invention relates to the
use of such EST sequences to obtain larger and/or complete
clones of the corresponding SCR gene.
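
The kind of EST comparison described above can be illustrated with a
minimal sketch. It is not the analysis actually performed by the
inventors, which the text describes only as a computer comparison
against GenBank EST sequences. The sketch assumes two hypothetical
helpers, translate and find_motif_hits, which translate a nucleotide
sequence in its three forward reading frames using the standard
genetic code and report where the conserved VHIID motif (Motif III)
occurs. The example sequence is made up for illustration and is not
taken from any EST named herein.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(nt, frame=0):
    """Translate a DNA string starting at offset `frame` (0, 1 or 2).

    Codons containing an ambiguous base (e.g. N) are rendered as "X".
    """
    nt = nt.upper()
    codons = (nt[i:i + 3] for i in range(frame, len(nt) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def find_motif_hits(nt, motif="VHIID"):
    """Return (frame, residue_index) for every motif match in frames 0-2."""
    hits = []
    for frame in range(3):
        protein = translate(nt, frame)
        start = protein.find(motif)
        while start != -1:
            hits.append((frame, start))
            start = protein.find(motif, start + 1)
    return hits


# Hypothetical EST-like fragment: one leading base, then codons for V-H-I-I-D.
example = "A" + "GTTCATATTATCGAT"
print(find_motif_hits(example))
```

A real screen would also examine the reverse-complement strand and
tolerate mismatches within the motif, which this sketch does not do.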
The various embodiments of the claimed invention
presented herein are by the way of illustration and are not
meant to limit the invention.

3.1. DEFINITIONS
As used herein, the terms listed below will have
the meanings indicated.

35S                  =  cauliflower mosaic virus promoter for the 35S
                        transcript

cDNA                 =  complementary DNA

cis-regulatory       =  A promoter sequence 5' upstream of the TATA
element                 box that confers specific regulatory response
                        to a promoter containing such an element. A
                        promoter may contain one or more cis-
                        regulatory elements, each responsible for a
                        particular regulatory response

coding               =  sequence that encodes a complete or partial
sequence                gene product (e.g., a complete protein or a
                        fragment thereof)

DNA                  =  deoxyribonucleic acid

EST                  =  expressed sequence tag

functional           =  a functional portion of a promoter is any
portion                 portion of a promoter that is capable of
                        causing transcription of a linked gene
                        sequence, e.g., a truncated promoter

gene                 =  a gene construct comprising a promoter
fusion                  operably linked to a heterologous gene,
                        wherein said promoter controls the
                        transcription of the heterologous gene

gene                 =  the RNA or protein encoded by a gene sequence
product

gene                 =  sequence that encodes a complete gene product
sequence                (e.g., a complete protein)

GUS                  =  1,3-β-Glucuronidase

gDNA                 =  genomic DNA

heterologous         =  In the context of gene constructs, a
gene                    heterologous gene means that the gene is
                        linked to a promoter that said gene is not
                        naturally linked to. The heterologous gene
                        may or may not be from the organism
                        contributing said promoter. The heterologous
                        gene may encode messenger RNA (mRNA),
                        antisense RNA or ribozymes

homologous           =  a native promoter of a gene that selectively
promoter                hybridizes to the sequence of a SCR gene
                        described herein

mRNA                 =  messenger RNA

operably             =  A linkage between a promoter and gene sequence
linked                  such that the transcription of said gene
                        sequence is controlled by said promoter

ortholog             =  related gene in a different plant (e.g., maize
                        ZCARECROW gene is an ortholog of the
                        Arabidopsis SCR gene)

paralog              =  related gene in the same plant (e.g.,
                        Arabidopsis SRPa1 is a paralog of Arabidopsis
                        SCR gene)

RNA                  =  ribonucleic acid

RNase                =  ribonuclease

SCR                  =  SCARECROW gene or gene product, encompasses
(italic)                SCR and ZCR genes and their orthologs and
                        paralogs

SCR                  =  SCARECROW protein

scr                  =  scarecrow mutant (e.g., scr1)
(lower case)

ZCR                  =  maize ZCARECROW gene, a paralog of, for
                        example, the Arabidopsis SCR gene


SCR protein means a protein containing sequences or
a domain substantially similar to one or more motifs (i.e.,
Motifs I-VI), preferably Motif III (VHIID), of Arabidopsis SCR
protein as shown in FIGS. 13A-F and FIGS. 15A-S. SCR
proteins include SCR ortholog and paralog proteins having the
structure and activities described herein.

SCR polypeptides and peptides include deleted or
truncated forms of the SCR protein, and fragments
corresponding to the SCR motifs described herein.

SCR fusion proteins encompass proteins in which the
SCR protein or an SCR polypeptide or peptide is fused to a
heterologous protein, polypeptide or peptide.

SCR gene, nucleotides or coding sequences means
nucleotides, e.g., gDNA or cDNA encoding SCR protein, SCR
polypeptides or peptides, or SCR fusion proteins.

SCR gene products include transcriptional products
such as mRNAs, antisense and ribozyme molecules, as well as
translational products of the SCR nucleotides described
herein including but not limited to the SCR protein,
polypeptides, peptides and/or SCR fusion proteins.

SCR promoter means the regulatory region native to
the SCR gene in a variety of species, which promotes the
organ and tissue specific pattern of SCR expression described
herein.



4. BRIEF DESCRIPTION OF THE FIGURES
FIGS. 1A-B. Schematic of Arabidopsis root anatomy.
FIG. 1A. Transverse section showing the four tissues,
epidermis, cortex, endodermis and pericycle that surround the
vascular tissue. In the longitudinal section, the
epidermal/lateral root cap initials and the cortex/endodermal
initials are shown at the base of their respective cell
files. FIG. 1B. Schematic of division pattern of the
cortex/endodermal initial. The initial expands then divides
anticlinally to reproduce itself and a daughter cell. The
daughter then divides periclinally to produce the progenitors
of the endodermis and cortex cell lineages. Abbreviations:
C, cortex; Da, daughter cell; E, endodermis; In, initial.

FIGS. 2A-F. Phenotype of scr mutant plants.
FIG. 2A. Shown left to right are 12-day scr-2, scr-1 and
wild-type seedlings grown vertically on nutrient agar medium.
FIG. 2B. 21-day scr-2 mutant plants in soil. FIG. 2C.
Transverse section through primary root of 7-day scr-2. FIG.
2D. Transverse section through primary root of 7-day wild-
type (WT). FIG. 2E. Transverse section through lateral root
of 12-day scr-1 mutant seedling. FIG. 2F. Transverse
section through root regenerated from scr-1 callus. Bar, 50
μm. Abbreviations: C, cortex; En, endodermis; Ep, epidermis;
M, mutant cell layer; P, pericycle; V, vascular tissue.

FIGS. 3A-F. Characterization of the cellular
identity of the mutant cell layer. FIG. 3A. Endodermis-
specific Casparian band staining of transverse sections
through the primary root of 7-day scr-1 mutant. (Note: the
histochemical stain also reveals xylem cells in the vascular
cylinder.) FIG. 3B. Casparian band staining of transverse
sections through the primary root of 7-day wild-type (WT).
FIG. 3C. Immunostaining with the endodermis (and a subset of
vascular tissue) specific JIM13 monoclonal antibodies on
transverse root sections of scr-2 mutant. FIG. 3D.
Immunostaining with JIM13 monoclonal antibodies on transverse
root sections of WT. FIG. 3E. Immunostaining with the JIM7
monoclonal antibody that stains all cell walls on transverse
root sections of scr-2 mutant. FIG. 3F. Immunostaining with
JIM7 monoclonal antibodies on transverse root sections of WT.
Bar, 25 μm. Abbreviations are same as those for description
of FIGS. 2A-2F and: Ca, casparian strip.

FIGS. 4A-F. Immunostaining. FIG. 4A.
Immunostaining with the cortex (and epidermis) specific CCRC-
M2 monoclonal antibodies on transverse root sections of scr-1
mutant. FIG. 4B. Immunostaining with CCRC-M2 antibodies on
transverse root sections of scr-2 mutant. FIG. 4C.
Immunostaining with CCRC-M2 antibodies on transverse root
sections of wild-type (WT). FIG. 4D. Immunostaining with
the CCRC-M1 monoclonal antibodies (specific to a cell wall
epitope found on all cells) on transverse root sections of
scr-1. FIG. 4E. Immunostaining with CCRC-M1 antibodies on
transverse root sections of scr-2. FIG. 4F. Immunostaining
with CCRC-M1 antibodies on transverse root sections of WT.
Bar, 30 μm. Abbreviations are same as those for description
of FIGS. 2A-2F.

FIGS. 5A-E. Structure of the Arabidopsis SCARECROW
gene. FIG. 5A. Nucleic acid sequence and deduced amino acid
sequence of the Arabidopsis SCR genomic region (SEQ ID NO:1)
and (SEQ ID NO:2), respectively. Regulatory sequences
including: (i) TATA box, (ii) ATG start codon, and (iii)
potential polyadenylation sequence are underlined. Within
the deduced amino acid sequence homopolymeric repeats are
underlined. FIG. 5B. Schematic diagram of genomic clone
indicating possible functional motifs, T-DNA insertion sites
and subclones used as probes. Abbreviations: Q,S,P,T, region
with homopolymeric repeats of these amino acids; b, region
with similarity to the basic region of bZIP factors; I and
II, regions with leucine heptad repeats; E, acidic region.
FIG. 5C. Comparison of the charged region found in
Arabidopsis SCR protein with that found in bZIP transcription
factors, SCR bZIP-like domain (SEQ ID NO:3), GCN4 (SEQ ID
NO:4), TGA1 (SEQ ID NO:5), c-Fos (SEQ ID NO:6), c-JUN (SEQ ID
NO:7), CREB (SEQ ID NO:8), Opaque-2 (SEQ ID NO:9), OBP2 (SEQ
ID NO:10), RAF-1 (SEQ ID NO:11). FIG. 5D. Translations of
EST clones encoding putative peptide having similarities to
the VHIID domain region of Arabidopsis SCR protein (SEQ ID
NO:12), F13896 (SEQ ID NO:13), Z37192 (SEQ ID NO:14), and
Z25645 (SEQ ID NO:15) are from Arabidopsis, T18310 (SEQ ID
NO:17) is from maize and D41474 (SEQ ID NO:16) is from rice.
FIG. 5E. The deduced amino acid sequence of the Arabidopsis
SCARECROW gene (SEQ ID NO:2).

FIGS. 6A-B. Expression of the Arabidopsis
SCARECROW gene. FIG. 6A. Northern blot of total RNA from
wild-type siliques (Si), roots (R), leaves (L) and whole
seedlings (Sd) hybridized with Arabidopsis SCR probe "a" and
with a probe from the Arabidopsis glutamate dehydrogenase
(GDH) gene (Melo-Oliveira et al., 1996, Proc. Natl. Acad.
Sci. USA 93:4718-4723) as a control for RNA integrity. (GDH
expression is lower in siliques than in vegetative tissues.)
The 1.6 kb band corresponds to the GDH gene and the
approximately 2.5 kb band corresponds to SCR. Ribosomal RNA
is shown as a loading control. FIG. 6B. Northern blot of
Arabidopsis wild-type, scr-1 and scr-2 total RNA, probed with
Arabidopsis SCR probe "a" corresponding to a cDNA sequence
shown in FIG. 5B, and with the GDH probe. In scr-2 mutant
additional bands of 4.1 kb and 5.0 kb were detected.

FIGS. 7A-G. In situ hybridization and enhancer
trap analyses of Arabidopsis SCR expression. FIG. 7A. SCR
RNA expression detected by in situ hybridization of SCR
antisense probe to a longitudinal section through the root
meristem. FIG. 7B. In situ hybridization of SCR antisense
probe to a transverse section in the meristematic region.
FIG. 7C. In situ hybridization of SCR antisense probe to
late torpedo stage embryo. FIG. 7D. Negative control in
situ hybridization using a SCR sense probe to a longitudinal
section through the root meristem. FIG. 7E. GUS expression
in a whole mount in the enhancer trap line, ET199 in primary
root tip. FIG. 7F. GUS expression in the ET199 line in
transverse root section in the meristematic region. FIG. 7G.
GUS expression in ET199 detected in a section through the
root meristem. GUS expression is observed in the
cortex/endodermal initial, and in the first cell in the
endodermal cell lineage but not in the first cell of the
cortex lineage. Expression in two endodermal layers is
observed higher up in the root because the section was not
median at that point. Bar, 50 μm. Abbreviations are same as
those in the description of FIGS. 2A-2F.
FIG. 8. Partial nucleotide sequence (SEQ ID NO:18)
and deduced amino acid sequence (SEQ ID NO:19) of the
Arabidopsis SRPa4 gene.

FIG. 9. Partial nucleotide sequence (SEQ ID NO:20)
and deduced amino acid sequence (SEQ ID NO:21) of the
Arabidopsis SRPa3 gene.

FIG. 10. Partial nucleotide sequence (SEQ ID
NO:22) of the Arabidopsis SRPa1 gene.

FIG. 11A. Nucleotide sequence (SEQ ID NO:24) and
deduced amino acid sequence (SEQ ID NO:25) of the maize Zm-
Scl1 fragment.

FIG. 11B. Partial nucleotide sequence (SEQ ID
NO:25) and deduced amino acid sequence (SEQ ID NO:26) of the
maize SRPm1 gene (Zm-Scl2).

FIGS. 12A-B. Nucleotide sequence of rice SRPo3 EST
clone. FIG. 12A. Sequence of 5' end of EST clone (SEQ ID
NO:28). FIG. 12B. Sequence of 3' end of EST clone (SEQ ID
NO:29).

FIGS. 13A-F. Comparison of the amino acid sequence
of members of the SCARECROW family of genes. Conserved
Motifs I through VI are indicated by dashed line above the
aligned sequences. Consensus sequences are shown in bold.
See Table 1 for the identity and sequence identifier number
of each of the sequences shown in this Figure. Hu-scr-1 =
Human SCR paralog (SEQ ID NO:40).

FIG. 14. Restriction map of the approximately 8.8
kb Eco RI insert DNA of lambda clone, t643, containing the
Arabidopsis SCR gene. The locations of the approximately 5.6
kb HindIII-SacI fragment subcloned in plasmid pLIG1-3/SAC+MoB,
ISAC, and the SCR coding region are indicated below the
restriction map. The location of the translational
initiation site of the SCR gene is at the Nco I site at the
left end of the indicated coding region. The SCR coding
sequence begins at the translation initiation site and
extends approximately 1955 nucleotides to its right. E. coli
DH5α containing plasmid pLIG1-3/SAC+MoB, ISAC, has the ATCC
accession number 98031.

FIGS. 15A-S. Comparison of the partial and
complete amino acid sequences of several plant members of the
SCARECROW family of genes. The amino acid sequences are
aligned in a manner that maximizes amino acid sequence
similarity and identity among SCR family members. Each
sequence shown is continuous except where noted otherwise;
the dots are inserted between two sequence segments in order
to align homologous segments. "X" in the middle of a
sequence indicates ambiguity in the corresponding nucleotide
sequence and possible termination of the ORF at the "X"
residue site. "X" at the end of a sequence indicates
termination of the ORF at the "X" residue site. The
numbering of the amino acid residues is shown at the bottom
of each figure and is based on the Arabidopsis SCR amino acid
sequence. Conserved Motifs I through VI are indicated by the
various dashed lines above the figures. The new and old
names of the family members are shown in FIG. 15A. The
sequences of SCR, Tf1 and Tf4 are of the complete SCR
protein. See Table 1 for the identity and the sequence
identifier number of each sequence shown in these figures.

FIGS. 16A-N. The partial nucleotide sequences of
several plant members of the SCARECROW family of genes. "N"
indicates an unknown base. See Table 1 for the identity and
the sequence identifier number of each sequence shown in
these figures.

FIG. 17A. The partial nucleotide sequence (SEQ ID
NO:66) of the maize ZCR gene.

FIG. 17B. The partial amino acid sequence (SEQ ID
NO:67) of the maize ZCR gene. The underlined sequence shares
approximately 80% sequence identity with a corresponding
sequence of Arabidopsis SCR protein.

FIG. 18. Comparison of the partial amino acid
sequences of several SCR ortholog sequences amplified from
the genomes of carrot, soybean and spruce. The SRPd1 and
SRPp1 sequences each were obtained by PCR amplification using
a combination of IF and IR primers. The SRPg1 sequence was
obtained by PCR amplification using a combination of IF and
WP primers. The amino acid sequences are aligned in a
manner that maximizes amino acid sequence identity and
similarity amongst these sequences. Each sequence shown is
continuous except where noted otherwise; the dashes are
inserted between two sequence segments in order to allow
alignment of homologous segments. "x" in the middle of a
sequence indicates ambiguity in the corresponding nucleotide
sequence and possible termination of the ORF or existence of
an intron at the "x" residue site. See Table 1 for the
identity and the sequence identifier number of each sequence
shown in this figure.

FIG. 19. Comparison of promoter activities in
transgenic lines and roots. Panel a. A stably transformed
line containing four copies of the B2 subdomain of the 35S
promoter of CaMV upstream of GUS (Benfey et al., 1990). GUS
is expressed in the root tip. Panel b. Roots emerging from
callus transformed with four copies of the B2 subdomain of
the 35S promoter fused to GUS. GUS expression can be seen in
the emerging root tips (arrows). Panel c. Higher
magnification of a root emerging from the callus in panel b.
GUS is clearly restricted to the root tip. The morphology of
roots regenerated from calli often appears abnormal. Panel
d. A transgenic plant regenerated from the calli and roots
shown in panel b. GUS expression in this plant appears to
be similar to that of the original line shown in panel a.
Panel e. ET199, a stably transformed line that contains an
enhancer trapping construct with a minimal promoter fused to
the GUS coding region inserted 1 kb upstream from the SCR
coding region. GUS expression is primarily in the endodermal
layer of the root. Panel f. Roots emerging from calli
transformed with the SCR promoter::GUS construct. Expression
of the GUS gene appears to be limited to an internal layer
(arrows). Panel g. SCR promoter::GUS transformed root in
liquid culture. Roots shown in panel f were excised and
transferred to liquid cultures. GUS expression is primarily
found in the endodermal layer as in ET199. The expression of
GUS in the quiescent center, as seen here, is also sometimes
observed in ET199. Bar, 50 μm.

FIG. 20. Analysis of SCR promoter activity in the
scr mutant background. Panel a. Roots emerging from scr
calli transformed with the SCR promoter::GUS construct.
Roots regenerated from scr calli are very short. GUS
expression appears to be limited to an internal layer of the
root (arrows). Panel b. Root regenerated from transformed
scr calli and transferred to liquid culture. The scr
phenotype, a single layer between the epidermis and
pericycle, is easily seen. GUS expression is limited to this
mutant layer. E, Epidermis. M, Mutant Layer. P, Pericycle.
Bar, 50 μm.

FIG. 21. Molecular Complementation of the scr
mutant. Panels a, c and e. scr transformed with the SCR
promoter::GUS construct. Panels b, d and f. scr transformed
with the SCR promoter::SCR coding region construct. Panels a
and b. Roots emerging from scr calli. Arrows point to
several very short roots among many fine root hairs in the
scr calli transformed with the SCR promoter::GUS construct.
In contrast, roots from scr calli transformed with the SCR
promoter::SCR coding region construct appeared to be
wild-type in length, suggesting molecular complementation by
the transgene. Panels c and d. Transgenic roots in liquid
culture. The scr roots transformed with the SCR
promoter::GUS construct appeared short, while those
transformed with the SCR promoter::SCR coding region
construct appeared of wild-type length. Panels e and f.
Transverse sections through roots emerging from calli.
Whereas there is only a single cell layer between the
epidermis and stele in the SCR promoter::GUS transformed
root, the radial organization of the root transformed with
the SCR promoter::SCR coding region appeared identical to
wild-type, with both cortex and endodermal layers. E,
epidermis. M, mutant layer. C, cortex. En, Endodermis. P,
Pericycle. Bar, 50 μm.

FIG. 22. Expression of ZCR in maize root tips.
Left Panel. Expression of ZCR is in the endodermal layer and
extends down through the region of the quiescent center.
Right Panel. Higher magnification showing expression in a
single cell layer through the quiescent center.

5. DETAILED DESCRIPTION OF THE INVENTION

The invention relates to the SCARECROW (SCR) gene,
SCR gene products, including but not limited to
transcriptional products such as mRNAs, antisense and
ribozyme molecules, and translational products such as the
SCR protein, polypeptides, peptides and fusion proteins
related thereto; antibodies to SCR gene products; SCR
regulatory regions; and the use of the foregoing to improve
agronomically valuable plants.

In summary, the data described herein show the
identification of SCR, a gene involved in the regulation of a
specific asymmetric division, in controlling gravitropic
response in aerial structures, and in controlling pattern
formation in roots. Sequence analysis shows that the SCR
protein has many hallmarks of transcription factors. In situ
and marker line expression studies show that SCR is expressed
in the cortex/endodermal initial of roots before asymmetric
division occurs, and in quiescent center of regenerating
roots. Together, these findings indicate that SCR gene
regulates key events that establish the asymmetric division
that generates separate cortex and endodermal cell lineages,
and that affect tissue organization of roots. The
establishment of these lineages is not required for cell
differentiation to occur, because in the absence of division
the resulting cell acquires mature characteristics of both
cortex and endodermal cells. However, it is possible that
SCR functions to establish the polarity of the initial before
cell division, or that it is involved in generating an
external polarity that has an effect on asymmetric cell
division.

Genetic analysis indicates that SCR expression
affects gravitropism of plant stems and hypocotyls. This
indicates that SCR is also expressed in these aerial
structures of plants.

The SCR genes and promoters of the present
invention have a number of important agricultural uses. The
SCR promoters of the invention may be used in expression
constructs to express desired heterologous gene products in
the embryo, root, root nodule, and starch sheath layer in the
stem of transgenic plants transformed with such constructs.
For example, SCR promoters may be used to express disease
resistance genes such as lysozymes, cecropins, magainins, or
thionins for anti-bacterial protection or the
pathogenesis-related (PR) proteins such as glucanases and
chitinases for anti-fungal protection. SCR promoters also
may be used to express a variety of pest resistance genes in
the aforementioned plant structures and tissues. Examples of
useful gene products for controlling nematodes or insects
include Bacillus thuringiensis endotoxins, protease
inhibitors, collagenases, chitinase, glucanases, lectins, and
glycosidases.

Gene constructs that express or ectopically express
SCR, and the SCR-suppression constructs of the invention, may
be used to alter the root and/or stem structure, and the
gravitropism of aerial structures, of transgenic plants.
Since SCR regulates root cell divisions, overexpression of
SCR can be used to increase division of certain cells in
roots and thereby form thicker and stronger roots. Thicker
and stronger roots are beneficial in preventing plant
lodging. Conversely, suppression of SCR expression can be
used to decrease cell division in roots and thereby form
thinner roots. Thinner roots are more efficient in uptake of
soil nutrients. Since SCR affects gravitropism of aerial
structures, overexpression of SCR may be used to develop
"straighter" transgenic plants that are less susceptible to
lodging.

Further, the SCR gene sequence may be used as a
molecular marker for a qualitative trait, e.g., a root or
gravitropism trait, in molecular breeding of crop plants.

For purposes of clarity and not by way of
limitation, the invention is described in the subsections
below in terms of (a) SCR genes and nucleotides; (b) SCR gene
products; (c) antibodies to SCR gene products; (d) SCR
promoters and promoter elements; (e) transgenic plants which
ectopically express SCR; (f) transgenic plants in which
endogenous SCR expression is suppressed; and (g) transgenic
plants in which expression of a transgene of interest is
controlled by the SCR promoter.

5.1. SCR GENES

The SCARECROW genes and nucleotide sequences of the
invention include: (a) a gene listed below in Table 1
(hereinafter, a gene comprising any one of the nucleotide
sequences shown in FIG. 5A, FIG. 8, FIG. 9, FIG. 10, FIGS.
11A-B, FIGS. 12A-B, FIGS. 16A-M, or FIG. 17A, or a segment of
such nucleotide sequences), or as contained in the clones
described herein and deposited with the ATCC (see Section 13,
infra); (b) nucleotide sequence that encodes a protein
comprising any one of the amino acid sequences shown in FIG.
5A, FIG. 5D, FIG. 5E, FIG. 8, FIG. 9, FIGS. 11A-B, FIGS.
13A-F, FIGS. 15A-S, FIG. 17B or FIG. 18 or a segment of such
amino acid sequences, or that is encoded by any one of the
genes and/or nucleotide sequences listed by their sequence
identifier numbers in Table 1, or any segment of such genes
and/or nucleotide sequences, or contained in any one of the
clones described herein and deposited with the ATCC (see
Section 13, infra); (c) any gene comprising nucleotide
sequence that hybridizes to the complement of any one of the
genes and/or nucleotide sequences listed by their sequence
identifier numbers in Table 1, or any segment of such genes
and/or nucleotide sequences, or as contained in any one of
the clones described herein and deposited with the ATCC,
under highly stringent conditions, e.g., hybridization to
filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate
(SDS), 1 mM EDTA at 65°C, and washing in 0.1xSSC/0.1% SDS at
68°C (Ausubel F.M. et al., eds., 1989, Current Protocols in
Molecular Biology, Vol. I, Green Publishing Associates, Inc.,
and John Wiley & Sons, Inc., New York, at p. 2.10.3) and that
encodes a gene product functionally equivalent to a SCR gene
product encoded completely or partly by any one of the genes
and/or sequences listed in Table 1 or any segment of such
genes and nucleotide sequences, or as contained in any one of
the clones deposited with the ATCC; (d) any gene comprising
nucleotide sequence that hybridizes to the complement of any
one of the sequences listed by their sequence identifier
numbers in Table 1, or any segment of such nucleotide
sequences, or as contained in any one of the clones described
herein and deposited with the ATCC, under less stringent
conditions, such as moderately stringent conditions, e.g.,
washing in 0.2xSSC/0.1% SDS at 42°C (Ausubel et al., 1989,
supra), and which encodes a functionally equivalent SCR gene
product; (e) any gene comprising nucleotide sequence that
hybridizes to the complement of any one of the sequences
listed by their sequence identifier numbers in Table 1 or any
segment of such nucleotide sequences, or as contained in any
one of the clones described herein and deposited with the
ATCC, under the following low stringency conditions:
pre-hybridization in hybridization solution (HS) containing 43%
formamide, 5xSSC, 1% SDS, 10% dextran sulfate, 0.1% sarkosyl,
2% block (Genius kit, Boehringer-Mannheim), followed by
hybridization overnight at 30 to 33°C using as a probe a DNA
molecule of approximately 1.6 kb of SEQ ID NO: 1 at a
concentration of 20 ng/ml, followed by washing in 2xSSC/0.1%
SDS two times for 15 minutes at room temperature and then two
times at 50°C, and which encodes a functionally equivalent
SCR gene product; and/or (f) any gene comprising nucleotide
sequence that encodes a polypeptide or protein containing the
consensus sequence for SCR (i.e., MOTIF III or VHIID) shown
in FIGS. 13B-D or a segment of such polypeptide or protein.
The partial and complete nucleotide and amino acid sequences
of SCR genes and encoded proteins and polypeptides included
in the invention are listed in Table 1 below.



Table 1. SCR ORTHOLOGS AND PARALOGS

New Name Old Name

ARABIDOPSIS






SRPa1
SRPa2
SRPa3
SRPa4
SRPa5
SRPa6
SRPa7
SRPa8
SRPa9
SRPa10
SRPa11
SRPa12
SRPa13
SCR

RICE

SRPo1
SRPo2
SRPo3
SRPo4

SRPm1

BRASSICA

SRPb1



1110 

Tf4 

3935 

4818 
4871 
12398 
3635 

Tfl 
10964 

11261 
18652 
23196 

33/08 
Scr 



713 
2504 



3989 
11846 

18310 

174 



EST Clone

Z2564S/33772 

Z34599 

Z37192/1 
N96166 

F13896/7 

F13949 

R29793 

T21627 
H76979 
N96767 

T46205 (9468) 
N96653 (21711) 

T78186 
T44774 

T76483 

N37425 

W43803 

W435138 

AA042397 

T46008 



D15490 

D40482 
D40607
D40800 
D41389 

D41474 

C20324 



SEQ ID NOs
Nucleotide Amino Acid



22 

20 

18 
45 
51 
55 



47 

49 
53 
57 



1* 



23 

35* 

21 

19 
46 

52 
56 

34' 

48 

50 
54 
58 

41 
2* 



43 
44 



T18310 



H74669 



36 
59 

37 

42

Table 1. (Continued)

New Name

SRMl
SOYBEAN
SRPg1
SPRUCE
SRPp1

Old Name EST Clone

SEQ ID NOs
Nucleotide Amino Acid



N.A. 



N.A. 



N.A. 



N.A. 



N.A. 



N.A. 



60 



62 



64 



61 



63 



65 



Each EST clone is identified by its GenBank accession
number. Each EST clone corresponds to a deposit of a
cDNA sequence that matches a part of the nucleotide
sequence of the corresponding SCR ortholog or paralog.

N.A. = not applicable.

The partial or complete nucleotide sequences of the SCR
orthologs and paralogs listed here are shown in FIGS.
5A, 8, 9, 10, 11A-B, 12A-B, 16A-M and 17A.

Contains the complete coding sequence of the Arabidopsis SCR
gene.

Contains the complete amino acid sequence of Arabidopsis
SRPa2, SRPa8, or SCR protein.



Functional equivalents of the SCR gene product
include any plant gene product that regulates plant embryo or
root development, or, preferably, that regulates root cell
division or root tissue organization, or affects gravitropism
of plant aerial structures (e.g., stems and hypocotyls).
Functional equivalents of the SCR gene product include
naturally occurring SCR gene products, and mutant SCR gene
products, whether naturally occurring or engineered.

The invention also includes nucleic acid molecules,
preferably DNA molecules, that hybridize to, and are
therefore the complements of, the nucleotide sequences (a)
through (f) in the first paragraph of this section. Such
hybridization conditions may be highly stringent, less highly
stringent, or low stringency as described above. In
instances wherein the nucleic acid molecules are
oligonucleotides ("oligos"), highly stringent conditions may
refer, e.g., to washing in 6xSSC/0.05% sodium pyrophosphate
at 37°C (for 14-base oligos), 48°C (for 17-base oligos), 55°C
(for 20-base oligos), and 60°C (for 23-base oligos). These
nucleic acid molecules may act as SCR antisense molecules,
useful, for example, in SCR gene regulation and/or as
antisense primers in amplification reactions of SCR gene
and/or nucleic acid sequences. Further, such sequences may
be used as part of ribozyme and/or triple helix sequences,
also useful for SCR gene regulation. Still further, such
molecules may be used as components in probing methods
whereby the presence of a SCARECROW allele may be detected.

The invention also includes nucleic acid molecules,
preferably DNA molecules, which are amplified using the
polymerase chain reaction under conditions described in
Section 5.1.1., infra, and that encode a gene product
functionally equivalent to a SCR gene product encoded by any
one of the genes and sequences listed in Table 1 or as
contained in any one of the clones described herein and
deposited with the ATCC.

The invention also encompasses (a) DNA vectors that
contain any of the foregoing gene and/or coding sequences
and/or their complements (i.e., antisense or ribozyme
molecules); (b) DNA expression vectors that contain any of
the foregoing gene and/or coding sequences operatively
associated with a regulatory element that directs the
expression of the gene and/or coding sequences; and (c)
genetically engineered host cells that contain any of the
foregoing gene and/or coding sequences operatively associated
with a regulatory element that directs the expression of the
gene and/or coding sequences in the host cell. As used
herein, regulatory elements include but are not limited to
inducible and non-inducible promoters, enhancers, operators
and other elements known to those skilled in the art that
drive and regulate expression.

The invention also encompasses nucleotide sequences
that encode mutant SCR gene products, peptide fragments of
the SCR gene product, truncated SCR gene products, and SCR
fusion proteins. These gene products include, but are not
limited to, nucleotide sequences encoding mutant SCR gene
products; polypeptides or peptides corresponding to one or
more of the Motifs I-VI as shown in FIGS. 13A-F and FIGS.
15A-S, or the bZIP, VHIID, or leucine heptad domains of the
SCR, or portions of these motifs and domains; truncated SCR
gene products in which one or more of the motifs or domains
is deleted, e.g., a truncated, nonfunctional SCR lacking all
or a portion of the Motifs I-VI as shown in FIGS. 13A-F and
FIGS. 15A-S, or the bZIP, VHIID, or leucine heptad domains of
the SCR. Nucleotides encoding fusion proteins may include
but are not limited to full length SCR, truncated SCR or
peptide fragments of SCR fused to an unrelated protein or
peptide, such as for example, an enzyme, fluorescent protein,
or luminescent protein which can be used as a marker.

In particular, the invention includes, for example,
fragments of SCR genes encoding one or more of the following
domains as shown in FIG. 5E: amino acids 1-264, 265-283,
287-316, 410-473, 436-473, and 473-653.

In addition to the gene and/or coding sequences
described above, homologous SCR genes, and other genes
related by DNA sequence, may be identified and may be readily
isolated, without undue experimentation, by molecular
biological techniques well known in the art. More
specifically, such homologs include, for example, paralogs
(i.e., members of the SCR gene family occurring in the same
plant) as well as orthologs (i.e., members of the SCR gene
family which occur in a different plant species) of the
Arabidopsis SCR gene.

A specific embodiment of a SCR gene and coding
sequence of the invention is Arabidopsis SCR (FIGS. 5A and
5E). Other specific embodiments include the various SCR
genes and coding sequences listed in Table 1, supra.

Methods for isolating SCR genes and coding 
sequences are described in detail in Section 5.2, below. 

SCR genes share substantial amino acid sequence
similarities at the protein level and nucleotide sequence
similarities in their encoding genes. The term
"substantially similar" or "substantial similarity" when used
herein with respect to two amino acid sequences means that
the two sequences have at least 75% identical residues,
preferably at least 85% identical residues and most
preferably at least 95% identical residues. The same term
when used herein with respect to two nucleotide sequences
means that the two sequences have at least 70% identical
residues, preferably at least 85% identical residues and most
preferably at least 95% identical residues. Determining
whether two sequences are substantially similar may be
carried out using any methodologies known to one skilled in
the art, preferably using computer assisted analysis. For
example, the alignments shown herein were initially
accomplished by a BLAST search (NCBI using the BLAST network
server). The final alignments of SCR family members were
done manually.

Moreover, SCR genes show highly localized
expression in embryos and, particularly, roots. Such
expression patterns may be ascertained by Northern
hybridizations and in situ hybridizations using antisense
probes.
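
The "substantially similar" thresholds defined above can be illustrated with a short calculation. The following Python sketch is not part of the disclosed methods; it assumes the two sequences have already been aligned (gaps written as "-") and that columns containing a gap are excluded from the count, a convention the text does not specify. The function names are hypothetical.

```python
# Minimal sketch: classify a pairwise alignment against the identity
# thresholds stated in the text (75/85/95% for amino acid sequences,
# 70/85/95% for nucleotide sequences).
# Assumption: gapped columns are ignored; the text does not say how
# gaps should be counted.

MIN_IDENTITY = {"protein": 75.0, "nucleotide": 70.0}


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical residues over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("alignment has no ungapped columns")
    return 100.0 * identical / compared


def similarity_tier(aligned_a: str, aligned_b: str, kind: str = "protein") -> str:
    """Map percent identity onto the tiers named in the text."""
    pid = percent_identity(aligned_a, aligned_b)
    if pid >= 95.0:
        return "most preferred"
    if pid >= 85.0:
        return "preferred"
    if pid >= MIN_IDENTITY[kind]:
        return "substantially similar"
    return "not substantially similar"
```

For instance, the two variants of the VHIID(L/F)D motif, "VHIIDLD" and "VHIIDFD", share 6 of 7 residues (about 85.7%), which falls in the "preferred" tier for amino acid sequences.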



5.1.1. ISOLATION OF SCR GENES

The following methods can be used to obtain SCR
genes and coding sequences from a wide variety of plants,
including but not limited to Arabidopsis thaliana, Zea mays,
Nicotiana tabacum, Daucus carota, Oryza, Glycine max, Lemna
gibba, and Picea abies.

Nucleotide sequences encoding an SCR gene or a
portion thereof may be obtained by PCR amplification of plant
genomic DNA or cDNA. Useful cDNA sources include "free" cDNA
preparations (i.e., the products of cDNA synthesis) and
cloned cDNA in cDNA libraries. Root cDNA preparations or
libraries are particularly preferred.

The amplification may use, as the 5'-primer (i.e.,
forward primer), a degenerate oligonucleotide that
corresponds to a segment of a known SCR amino acid sequence,
preferably from the amino-terminal region. The 3'-primer
(i.e., reverse primer) may be a degenerate oligonucleotide
that corresponds to a distal segment of the same known SCR
amino acid sequence (i.e., carboxyl to the sequence that
corresponds to the 5'-primer). For example, the amino acid
sequence of the Arabidopsis SCR protein (SEQ ID NO: 2) may be
used to design useful 5' and 3' primers. Preferably, the
primers correspond to segments in the Motif III or VHIID
domain of SCR protein (see FIGS. 13B-D and FIGS. 15K-L). The
sequence of the optimal degenerate oligonucleotide probe
corresponding to a known amino acid sequence may be
determined by standard algorithms known in the art. See, for
example, Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY, Vol. 2 (1989).

Further, for amplification from cDNA sources, the
3'-primer may be an oligonucleotide comprising a 3'
oligo(dT) sequence. The amplification may also use as
primers nucleotide sequences of SCR genes or coding sequences
(e.g., any one of the SCR sequences and EST sequences listed
in Table 1).

PCR amplification can be carried out, e.g., by use
of a Perkin-Elmer Cetus thermal cycler and Taq polymerase
(Gene Amp™). One can choose to synthesize several different
degenerate primers for use in the PCR reactions. It is also
possible to vary the stringency of hybridization conditions
used in priming the PCR reactions, to allow for greater or
lesser degrees of nucleotide sequence similarity between the
degenerate primers and the corresponding sequences in the
cDNA library. One of ordinary skill in the art will know
that the appropriate amplification conditions and parameters
depend, in part, on the length and base composition of the
primers and that such conditions may be determined using
standard formulae. Protocols for executing all PCR
procedures discussed herein are well known to those skilled
in the art, and may be found in references such as Gelfand,
1989, PCR Technology, Principles and Applications for DNA
Amplification, H.A. Erlich, ed., Stockton Press, New York;
and Current Protocols in Molecular Biology, Vol. 2, Ch. 15,
Ausubel et al., eds., 1988, New York, Wiley & Sons, Inc.

A PCR amplified sequence may be molecularly cloned
and sequenced. The amplified sequence may be utilized as a
probe to isolate genomic or cDNA clones of a SCR gene, as
described below. This, in turn, will permit the
determination of a SCR gene's complete nucleotide sequence,
including its promoter, the analysis of its expression, and
the production of its encoded protein, as described infra.

In a preferred embodiment, PCR amplification of SCR
gene and/or coding sequences can be carried out according to
the following procedure:

Forward:

Name:          SCR5AII (23-mer, 2 inosines, 64-mix)
A.A. code:     HFTANQAI
DNA Sequence:  5' CAT/C TTT/C ACI GCI AAT/C CAA/G GCN AT 3'

Name:          SCR5B (29-mer, 1 inosine, 144-mix)
A.A. code:     VHIID(L/F)D
DNA Sequence:  5' ACGTCTCGA GTI CAT/C ATA/C/T ATA/C/T GAT/C
               TTN GA 3'

Name:          1F
A.A. code:     LQCAEAV
DNA Sequence:  (T/C)TI CA(A/G) TG(T/C) GCI GA(A/G) GCN GT

Reverse:

Name:          SCR3AII (23-mer, 2 inosines, 128-mix)
A.A. code:     PGGPP(H/N/K)(V/L/F)R
DNA Sequence:  5' CG/T CCA/C GTG/T TGG IGG ICC NCC NGG 3'

Name:          1R
A.A. code:     AFQVFNGI
DNA Sequence:  AT ICC (A/G)TT (A/G)AA IAC (C/T)TG (A/G)AA NGC

Name:          4R
A.A. code:     QWPGLFHI
DNA Sequence:  AT (A/G)TG (A/G)AA IA(A/G) NCC IGG CCA (C/T)TG

I = Inosine
N = A/C/G/T

Useful primer combinations include the following:
SCR5AII+SCR3AII; SCR5B+SCR3AII; 1F+1R; and 1F+4R
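
The length, inosine count and "mix" (degeneracy) figures quoted for the named primers can be checked with a small script. The following Python sketch is illustrative only: the parser and its function names are hypothetical, and it assumes that a slash lists alternative bases for a single position, that N stands for four bases, and that inosine (I) counts as one species because it is a single incorporated base.

```python
# Minimal sketch: parse the primer notation used in the table above and
# compute length, number of inosines, and degeneracy ("mix").
# Assumptions: "T/C" or "(T/C)" = one position with alternative bases;
# N = A/C/G/T (4 species); I (inosine) = 1 species.


def parse_primer(text: str) -> list:
    """Return one string of allowed symbols per primer position."""
    positions = []
    expect_alternative = False
    for ch in text.upper():
        if ch in " ()":
            continue  # spacing and parentheses only group codons
        if ch == "/":
            expect_alternative = True
            continue
        if ch not in "ACGTIN":
            raise ValueError("unexpected symbol: " + ch)
        if expect_alternative:
            positions[-1] += ch  # alternative base for the same position
            expect_alternative = False
        else:
            positions.append(ch)
    return positions


def degeneracy(positions: list) -> int:
    """Number of distinct oligonucleotide species in the primer pool."""
    total = 1
    for symbols in positions:
        total *= 4 if symbols == "N" else len(symbols)
    return total


def describe(text: str) -> tuple:
    """(length, inosine count, degeneracy) for a primer in table notation."""
    positions = parse_primer(text)
    return len(positions), positions.count("I"), degeneracy(positions)


print(describe("CAT/C TTT/C ACI GCI AAT/C CAA/G GCN AT"))               # (23, 2, 64)
print(describe("ACGTCTCGA GTI CAT/C ATA/C/T ATA/C/T GAT/C TTN GA"))     # (29, 1, 144)
print(describe("CG/T CCA/C GTG/T TGG IGG ICC NCC NGG"))                 # (23, 2, 128)
```

Under these assumptions the computed values agree with the annotations given for SCR5AII, SCR5B and SCR3AII.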

PCR:

Reaction mixture (volume 50 µl):

- 5 µl 10X amplification buffer containing Mg (Boehringer-
  Mannheim)
- 1 µl 10 mM dNTPs
- 1 µl forward primer (stock concentration: 80 pmol/µl)
- 1 µl reverse primer (80 pmol/µl)
- DNA (100-300 ng).

Begin reaction with "hot start" in which the enzyme is added
to the mix only after a brief denaturation at a high
temperature (80°C).
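
The final concentrations implied by the reaction mixture follow from simple dilution arithmetic. The Python sketch below is an illustration, not part of the protocol; the function name is hypothetical, and whether "10 mM dNTPs" refers to each nucleotide or to the total is not stated in the text.

```python
# Minimal sketch: final concentration after adding a stock to a 50 µl
# reaction (C_final = C_stock * V_added / V_total).
# Assumption: the 50 µl total volume already includes all components.

REACTION_VOLUME_UL = 50.0


def final_concentration(stock: float, added_ul: float,
                        total_ul: float = REACTION_VOLUME_UL) -> float:
    """Concentration in the reaction, in the same unit as the stock."""
    return stock * added_ul / total_ul


buffer_final_x = final_concentration(10.0, 5.0)    # 10X buffer, 5 µl
dntp_final_mm = final_concentration(10.0, 1.0)     # 10 mM dNTPs, 1 µl
primer_final_um = final_concentration(80.0, 1.0)   # 80 pmol/µl = 80 µM, 1 µl
primer_total_pmol = 80.0 * 1.0                     # pmol of each primer per tube

print(buffer_final_x, dntp_final_mm, primer_final_um, primer_total_pmol)
```

This gives 1X buffer, 0.2 mM dNTPs, and 1.6 µM (80 pmol) of each primer per reaction, a comparatively high primer concentration that is in keeping with the use of highly degenerate primer pools.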

Cycles:

94°C  30 sec - brief denaturation (to prevent non-specific
               priming)
80°C  5 min  - apply the enzyme to the tubes (30 tubes/round
               maximum)
94°C  5 min  - thorough denaturation

2 times:   94°C  1 min
           64°C  5 min
           72°C  2 min

2 times:   94°C  1 min
           62°C  5 min
           72°C  2 min

2 times:   94°C  1 min
           60°C  5 min
           72°C  2 min

(reduce the annealing temperature 2°C in every second round),
until 44°C is reached; after that:

40 times:  94°C  20 sec
           48°C  1 min
           72°C  2 min

Finally, let cool down to 15°C.
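
The stepwise reduction of the annealing temperature described above can be written out programmatically. The following Python sketch is illustrative only; the function names are hypothetical, and it assumes that the 44°C step is itself run for two rounds before the 40-cycle block begins, which is one reading of "until 44°C is reached". Ramp times between temperatures are not included.

```python
# Minimal sketch: expand the cycling program above into explicit blocks.
# Each block is (repeats, [(temperature_C, seconds), ...]).
# Assumption: the annealing steps run 64, 62, ..., 44 °C inclusive,
# two rounds each.


def touchdown_blocks(start_c: int = 64, stop_c: int = 44,
                     step_c: int = 2, repeats: int = 2) -> list:
    """Blocks with the annealing temperature lowered every `repeats` rounds."""
    blocks = []
    for anneal_c in range(start_c, stop_c - 1, -step_c):
        blocks.append((repeats, [(94, 60), (anneal_c, 300), (72, 120)]))
    return blocks


def full_program() -> list:
    """Hot start, touchdown rounds, then the 40-cycle amplification."""
    hot_start = [(1, [(94, 30)]), (1, [(80, 300)]), (1, [(94, 300)])]
    amplification = [(40, [(94, 20), (48, 60), (72, 120)])]
    return hot_start + touchdown_blocks() + amplification


def total_seconds(program: list) -> int:
    """Programmed hold time, excluding temperature ramps."""
    return sum(reps * sum(sec for _, sec in steps) for reps, steps in program)


for reps, steps in full_program():
    print(reps, "x", ", ".join(f"{t}°C {s}s" for t, s in steps))
print("hold time (min):", round(total_seconds(full_program()) / 60, 1))
```

Under these assumptions the program comprises 22 touchdown rounds followed by 40 amplification cycles, for a programmed hold time of roughly 320 minutes.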



A SCR gene coding sequence may also be isolated by
screening a plant genomic or cDNA library using a SCR
nucleotide sequence (e.g., the sequence of any of the SCR
genes and sequences and EST clone sequences listed in Table
1) as hybridization probe. For example, the whole or a
segment of the Arabidopsis SCR nucleotide sequence (FIG. 5A)
may be used. Alternatively, a SCR gene may be isolated from
such libraries using as probe a degenerate oligonucleotide
that corresponds to a segment of a SCR amino acid sequence.
For example, a degenerate oligonucleotide probe corresponding
to a segment of the Arabidopsis SCR amino acid sequence (FIG.
5E) may be used.

In preparation of cDNA libraries, total RNA is
isolated from plant tissues, preferably roots. Poly(A)+ RNA
is isolated from the total RNA, and cDNA prepared from the
poly(A)+ RNA, all using standard procedures. See, for
example, Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2d ed., Vol. 2 (1989). The cDNAs may be synthesized
with a restriction enzyme site at their 3'-ends by using an
appropriate primer and further have linkers or adaptors
attached at their 5'-ends to facilitate the insertion of the
cDNAs into suitable cDNA cloning vectors. Alternatively,
adaptors or linkers may be attached to the cDNAs after the
completion of cDNA synthesis.

In preparation of genomic libraries, plant DNA is
isolated and fragments are generated, some of which will
encode parts of the whole SCR protein. The DNA may be
cleaved at specific sites using various restriction enzymes.
Alternatively, one may use DNase in the presence of manganese
to fragment the DNA, or the DNA can be physically sheared, as
for example, by sonication. The DNA fragments can then be
separated according to size by standard techniques, including
but not limited to, agarose and polyacrylamide gel
electrophoresis, column chromatography and sucrose gradient
centrifugation.

The genomic DNA or cDNA fragments can be inserted
into suitable vectors, including but not limited to,
plasmids, cosmids, bacteriophages lambda or T4, and yeast
artificial chromosome (YAC). (See, for example, Sambrook et
al., Molecular Cloning: A Laboratory Manual, 2d ed., Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, New York
(1989); Glover, D.M. (ed.), DNA Cloning: A Practical Approach,
IRL Press, Ltd., Oxford, U.K., Vols. I and II (1985)).

The SCR nucleotide probe, DNA or RNA, should be at
least 17 nucleotides, preferably at least 26 nucleotides, and
most preferably at least 50 nucleotides in length. The
nucleotide probe is hybridized under moderate stringency
conditions and washed under moderate, preferably high
stringency conditions. Clones in libraries with insert DNA
having substantial homology to the SCR probe will hybridize
to the probe. Hybridization of the nucleotide probe to
genomic or cDNA libraries is carried out using methods known
in the art. One of ordinary skill in the art will know that
the appropriate hybridization and wash conditions depend on
the length and base composition of the probe and that such
conditions may be determined using standard formulae. See,
for example, Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY, Vol. 2, (1989) pp. 11.45-11.57 and 15.55-
15.57.

The identity of a cloned or amplified SCR gene
sequence can be verified by comparing the amino acid
sequences of its three open reading frames with the amino
acid sequence of a SCR gene (e.g., Arabidopsis SCR protein
[SEQ ID NO:2]). A SCR gene or coding sequence encodes a
protein or polypeptide whose amino acid sequence is
substantially similar to that of a SCR protein or polypeptide
(e.g., the amino acid sequence of any one of the SCR proteins
and/or polypeptides shown in FIGS. 5A, 5E, FIG. 8, FIG. 9,
FIGS. 11A-B, FIGS. 15A-S, FIG. 17B and FIG. 18). The
identity of the cloned or amplified SCR gene sequence may be
further verified by examining its expression pattern, which
should show highly localized expression in the embryo and/or
root of the plant from which the SCR gene sequence was
isolated.
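
The three-frame comparison described above can be sketched as follows. This Python example is illustrative only; it assumes the standard genetic code, translates only the three forward frames named in the text (the reverse strand would need the same treatment after reverse-complementing), and uses a search for the VHIID motif as a stand-in for a full comparison with SEQ ID NO:2. All function names are hypothetical.

```python
# Minimal sketch: translate a cloned sequence in its three forward
# reading frames and look for a conserved SCR motif (VHIID).
# Assumption: standard genetic code; unknown codons are shown as "X".
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate one forward frame (0, 1 or 2); '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def three_frames(dna: str) -> list:
    """Translations of the three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]


def frames_with_motif(dna: str, motif: str = "VHIID") -> list:
    """Indices of the forward frames whose translation contains the motif."""
    return [i for i, protein in enumerate(three_frames(dna)) if motif in protein]
```

A frame that contains the motif can then be aligned against a known SCR protein and scored against the substantial-similarity thresholds given in Section 5.1.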

Comparison of the amino acid sequences encoded by a
cloned or amplified sequence may reveal that it does not
contain the entire SCR gene or its promoter. In such a case
the cloned or amplified SCR gene sequence may be used as a
probe to screen a genomic library for clones having inserts
that overlap the cloned or amplified SCR gene sequence. A
complete SCR gene and its promoter may be reconstructed by
splicing the overlapping SCR gene sequences.

5.1.2. EXPRESSION OF SCR GENE PRODUCTS 

SCR proteins, polypeptides and peptide fragments,
mutated, truncated or deleted forms of SCR and/or SCR fusion
proteins can be prepared for a variety of uses, including but
not limited to the generation of antibodies, as reagents in
assays, the identification of other cellular gene products
involved in regulation of root development, etc.

SCR translational products include, but are not
limited to, those proteins and polypeptides encoded by the SCR
gene sequences described in Section 5.1, above. The
invention encompasses proteins that are functionally
equivalent to the SCR gene products described in Section 5.1.
Such a SCR gene product may contain one or more deletions,
additions or substitutions of SCR amino acid residues within
the amino acid sequence encoded by any one of the SCR gene
sequences described, above, in Section 5.1, but which result
in a silent change, thus producing a functionally equivalent
SCR gene product. Amino acid substitutions may be made on
the basis of similarity in polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic nature
of the residues involved.

For example, nonpolar (hydrophobic) amino acids
include alanine, leucine, isoleucine, valine, proline,
phenylalanine, tryptophan, and methionine; polar neutral
amino acids include glycine, serine, threonine, cysteine,
tyrosine, asparagine, and glutamine; positively charged
(basic) amino acids include arginine, lysine, and histidine;
and negatively charged (acidic) amino acids include aspartic
acid and glutamic acid. "Functionally equivalent", as
utilized herein, refers to a protein capable of exhibiting a
substantially similar in vivo activity as the endogenous SCR
gene products encoded by the SCR gene sequences described in
Section 5.1, above. Alternatively, "functionally equivalent"
may refer to peptides capable of regulating gene expression
in a manner substantially similar to the way in which the
corresponding portion of the endogenous SCR gene product
would.
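
The four amino acid classes listed above can be encoded directly for screening candidate substitutions. The Python sketch below is illustrative only; treating "same class" as the test for a conservative substitution is a simplifying assumption, since the text names several properties (polarity, charge, solubility, and so on) and functional equivalence must still be confirmed experimentally. The names used are hypothetical.

```python
# Minimal sketch: the residue classes from the text, as one-letter codes.
# Assumption: a substitution is called "conservative" here when both
# residues fall in the same class; this is a heuristic, not a guarantee
# of a silent change.

RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala Leu Ile Val Pro Phe Trp Met
    "polar neutral": set("GSTCYNQ"),  # Gly Ser Thr Cys Tyr Asn Gln
    "basic": set("RKH"),              # Arg Lys His
    "acidic": set("DE"),              # Asp Glu
}


def residue_class(residue: str) -> str:
    """Name of the class containing the one-letter residue code."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError("not a standard amino acid: " + residue)


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same class."""
    return residue_class(original) == residue_class(replacement)
```

For example, replacing leucine with isoleucine stays within the nonpolar class, whereas replacing aspartic acid with lysine changes the charge class and would not be expected to be silent.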

The invention also encompasses mutant SCR proteins
and polypeptides that are not functionally equivalent to
the gene products described in Section 5.1. Such a mutant
SCR protein or polypeptide may contain one or more deletions,
additions or substitutions of SCR amino acid residues within
the amino acid sequence encoded by any one of the SCR gene
sequences described above in Section 5.1, and which result
in loss of one or more functions of the SCR protein (e.g.,
recognition of a specific nucleic acid sequence, binding of a
transcription factor, etc.), thus producing a SCR gene
product not functionally equivalent to the wild-type SCR
protein.

While random mutations can be made to SCR DNA
(using random mutagenesis techniques well known to those
skilled in the art) and the resulting mutant SCRs tested for
activity, site-directed mutations of the SCR gene and/or
coding sequence can be engineered (using site-directed
mutagenesis techniques well known to those skilled in the
art) to generate mutant SCRs with increased function (e.g.,
resulting in improved root formation), or decreased function
(e.g., resulting in suboptimal root function). In
particular, mutated SCR proteins in which any of the domains
shown in FIGS. 13A-F are deleted or mutated are within the
scope of the invention. Additionally, peptides corresponding
to one or more domains of the SCR (e.g., shown in FIGS.
13A-F), truncated or deleted SCRs, as well as fusion proteins
in which the full length SCR, a SCR polypeptide or peptide is
fused to an unrelated protein are also within the scope of the
invention and can be designed on the basis of the SCR
nucleotide and SCR amino acid sequences disclosed in Section
5.1, above.

While the SCR polypeptides and peptides can be
chemically synthesized (e.g., see Creighton, 1983, Proteins:
Structures and Molecular Principles, W.H. Freeman & Co.,
N.Y.), large polypeptides derived from SCR and the full length
SCR may advantageously be produced by recombinant DNA
technology using techniques well known to those skilled in
the art for expressing nucleic acid sequences.

Methods which are well known to those skilled in
the art can be used to construct expression vectors
containing SCR protein coding sequences and appropriate
transcriptional/translational control signals. These methods
include, for example, in vitro recombinant DNA techniques,
synthetic techniques and in vivo recombination/genetic
recombination. See, for example, the techniques described in
Sambrook et al., 1989, supra, and Ausubel et al., 1989,
supra. Alternatively, RNA capable of encoding SCR protein
sequences may be chemically synthesized using, for example,
synthesizers. See, for example, the techniques described in
"Oligonucleotide Synthesis", 1984, Gait, M.J. ed., IRL Press,
Oxford.

A variety of host-expression vector systems may be
utilized to express the SCR gene products of the invention.
Such host-expression systems represent vehicles by which the
SCR gene products of interest may be produced and
subsequently recovered and/or purified from the culture or
plant (using purification methods well known to those skilled
in the art), but also represent cells which may, when
transformed or transfected with the appropriate nucleotide
coding sequences, exhibit the SCR protein of the invention in
situ. These include but are not limited to microorganisms
such as bacteria (e.g., E. coli, B. subtilis) transformed
with recombinant bacteriophage DNA, plasmid DNA or cosmid DNA
expression vectors containing SCR protein coding sequences;
yeast (e.g., Saccharomyces, Pichia) transformed with
recombinant yeast expression vectors containing the SCR
protein coding sequences; insect cell systems infected with
recombinant virus expression vectors (e.g., baculovirus)
containing the SCR protein coding sequences; plant cell
systems infected with recombinant virus expression vectors
(e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus,
TMV) or transformed with recombinant plasmid expression
vectors (e.g., Ti plasmid) containing SCR protein coding
sequences; or mammalian cell systems (e.g., COS, CHO, BHK,
293, 3T3) harboring recombinant expression constructs
containing promoters derived from the genome of mammalian
cells (e.g., metallothionein promoter) or from mammalian
viruses (e.g., the adenovirus late promoter; the vaccinia
virus 7.5K promoter; the cytomegalovirus promoter/enhancer;
etc.).

In bacterial systems, a number of expression
vectors may be advantageously selected depending upon the use
intended for the SCR protein being expressed. For example,
when a large quantity of such a protein is to be produced,
for the generation of antibodies or to screen peptide
libraries, for example, vectors which direct the expression
of high levels of fusion protein products that are readily
purified may be desirable. Such vectors include, but are not
limited to, the E. coli expression vector pUR278 (Ruther et
al., 1983, EMBO J. 2:1791), in which the SCR coding sequence
may be ligated individually into the vector in frame with the
lac Z coding region so that a fusion protein is produced; pIN
vectors (Inouye & Inouye, 1985, Nucleic Acids Res. 13:3101-3109;
Van Heeke & Schuster, 1989, J. Biol. Chem. 264:5503-5509);
and the like. pGEX vectors may also be used to
express foreign polypeptides as fusion proteins with
glutathione S-transferase (GST). In general, such fusion proteins
are soluble and can easily be purified from lysed cells by
adsorption to glutathione-agarose beads followed by elution
in the presence of free glutathione. The pGEX vectors are
designed to include thrombin or factor Xa protease cleavage
sites so that the cloned target gene protein can be released
from the GST moiety.

In one such embodiment of a bacterial system, full
length cDNA sequences are appended with in-frame Bam HI sites
at the amino terminus and Eco RI sites at the carboxyl
terminus using standard PCR methodologies (Innis et al.,
1990, supra) and ligated into the pGEX-2TK vector (Pharmacia,
Uppsala, Sweden). The resulting cDNA construct contains a
kinase recognition site at the amino terminus for radioactive
labelling and glutathione S-transferase sequences at the
carboxyl terminus for affinity purification (Nilsson et al.,
1985, EMBO J. 4:1075; Zabeau and Stanley, 1982, EMBO J.
1:1217).



The recombinant constructs of the present invention
may include a selectable marker for propagation of the
construct. For example, a construct to be propagated in
bacteria preferably contains an antibiotic resistance gene
such as one that confers resistance to kanamycin,
tetracycline, streptomycin, or chloramphenicol. Suitable
vectors for propagating the construct include plasmids,
cosmids, bacteriophages or viruses, to name but a few.

In addition, the recombinant constructs may include
plant-expressible, selectable, or screenable marker genes for
isolating, identifying or tracking plant cells transformed by
these constructs. Selectable markers include, but are not
limited to, genes that confer antibiotic resistance (e.g.,
resistance to kanamycin or hygromycin) or herbicide
resistance (e.g., resistance to sulfonylurea,
phosphinothricin, or glyphosate). Screenable markers
include, but are not limited to, genes encoding
β-glucuronidase (Jefferson, 1987, Plant Mol. Biol. Rep. 5:387-
405), luciferase (Ow et al., 1986, Science 234:856-859), and the B
protein that regulates anthocyanin pigment production (Goff
et al., 1990, EMBO J 9:2517-2522).

In embodiments of the present invention which
utilize the Agrobacterium tumefaciens system for transforming
plants (see infra), the recombinant constructs may
additionally comprise at least the right T-DNA border
sequences flanking the DNA sequences to be transformed into
the plant cell. Alternatively, the recombinant constructs
may comprise the right and left T-DNA border sequences
flanking the DNA sequence. The proper design and
construction of such T-DNA based transformation vectors are
well known to those skilled in the art.

5.1.3. ANTIBODIES TO SCR PROTEINS AND POLYPEPTIDES
Antibodies that specifically recognize one or more
epitopes of SCR, or epitopes of conserved variants of SCR, or
peptide fragments of the SCR are also encompassed by the
invention. Such antibodies include but are not limited to
polyclonal antibodies, monoclonal antibodies (mAbs),
humanized or chimeric antibodies, single chain antibodies,
Fab fragments, F(ab')2 fragments, fragments produced by a Fab
expression library, anti-idiotypic (anti-Id) antibodies, and
epitope-binding fragments of any of the above.
For the production of antibodies, various host
animals may be immunized by injection with the SCR protein,
an SCR peptide (e.g., one corresponding to a functional
domain of the protein), a truncated SCR polypeptide (SCR in
which one or more domains has been deleted), functional
equivalents of the SCR protein, or mutants of the SCR
protein. Such SCR proteins, polypeptides, peptides or fusion
proteins can be prepared and obtained as described in Section
5.1.2, supra. Host animals may include but are not limited
to rabbits, mice, and rats, to name but a few. Various
adjuvants may be used to increase the immunological response,
depending on the host species, including but not limited to
Freund's (complete and incomplete), mineral gels such as
aluminum hydroxide, surface active substances such as
lysolecithin, pluronic polyols, polyanions, peptides, oil
emulsions, keyhole limpet hemocyanin, dinitrophenol, and
potentially useful human adjuvants such as BCG (bacille
Calmette-Guerin) and Corynebacterium parvum. Polyclonal
antibodies are heterogeneous populations of antibody
molecules derived from the sera of the immunized animals.

Monoclonal antibodies, which are homogeneous
populations of antibodies to a particular antigen, may be
obtained by any technique which provides for the production
of antibody molecules by continuous cell lines in culture.
These include, but are not limited to, the hybridoma
technique of Kohler and Milstein (1975, Nature 256:495-497;
and U.S. Patent No. 4,376,110), the human B-cell hybridoma
technique (Kosbor et al., 1983, Immunology Today 4:72; Cole
et al., 1983, Proc. Natl. Acad. Sci. USA 80:2026-2030), and
the EBV-hybridoma technique (Cole et al., 1985, Monoclonal
Antibodies And Cancer Therapy, Alan R. Liss, Inc., pp. 77-
96). Such antibodies may be of any immunoglobulin class
including IgG, IgM, IgE, IgA, IgD and any subclass thereof.
The hybridoma producing the mAb of this invention may be
cultivated in vitro or in vivo. Production of high titers of
mAbs in vivo makes this the presently preferred method of
production.
In addition, techniques developed for the
production of "chimeric antibodies" (Morrison et al., 1984,
Proc. Natl. Acad. Sci., 81:6851-6855; Neuberger et al., 1984,
Nature, 312:604-608; Takeda et al., 1985, Nature, 314:452-
454) by splicing the genes from a mouse antibody molecule of
appropriate antigen specificity together with genes from a
human antibody molecule of appropriate biological activity
can be used. A chimeric antibody is a molecule in which
different portions are derived from different animal species,
such as those having a variable region derived from a murine
mAb and a human immunoglobulin constant region.

In addition, techniques have been developed for the
production of humanized antibodies. (See, e.g., Queen, U.S.
Patent No. 5,585,089.) An immunoglobulin light or heavy
chain variable region consists of a "framework" region
interrupted by three hypervariable regions, referred to as
complementarity determining regions (CDRs). The extent of
the framework region and CDRs have been precisely defined
(see "Sequences of Proteins of Immunological Interest",
Kabat, E. et al., U.S. Department of Health and Human
Services (1983)). Briefly, humanized antibodies are antibody
molecules from non-human species having one or more CDRs from
the non-human species and a framework region from a human
immunoglobulin molecule.
Alternatively, techniques described for the
production of single chain antibodies (U.S. Patent 4,946,778;
Bird, 1988, Science 242:423-426; Huston et al., 1988, Proc.
Natl. Acad. Sci. USA 85:5879-5883; and Ward et al., 1989,
Nature 334:544-546) can be adapted to produce single chain
antibodies against SCR proteins or polypeptides. Single
chain antibodies are formed by linking the heavy and light
chain fragments of the Fv region via an amino acid bridge,
resulting in a single chain polypeptide.

Antibody fragments which recognize specific
epitopes may be generated by known techniques. For example,
such fragments include but are not limited to: the F(ab')2
fragments which can be produced by pepsin digestion of the
antibody molecule and the Fab fragments which can be
generated by reducing the disulfide bridges of the F(ab')2
fragments. Alternatively, Fab expression libraries may be
constructed (Huse et al., 1989, Science, 246:1275-1281) to
allow rapid and easy identification of monoclonal Fab
fragments with the desired specificity.

Antibodies to a SCR protein and/or polypeptide can,
in turn, be utilized to generate anti-idiotype antibodies
that "mimic" SCR, using techniques well known to those
skilled in the art. (See, e.g., Greenspan & Bona, 1993,
FASEB J 7(5):437-444; and Nissinoff, 1991, J. Immunol.
147(8):2429-2438).



5.1.4. SCR GENE OR GENE PRODUCTS AS
MARKERS FOR QUALITATIVE TRAIT LOCI

Any of the nucleotide sequences (including EST
clone sequences) described in §§ 5.1 and 5.1.1 and/or listed
in Table 1, and/or polypeptides and proteins described in
§ 5.1.2 and/or listed in Table 1, can be used as markers
for qualitative trait loci in breeding programs for crop
plants. To this end, the nucleic acid molecules, including
but not limited to full length SCR coding sequences, and/or
partial sequences (ESTs), can be used in hybridization and/or
DNA amplification assays to identify the endogenous SCR
genes, scr mutant alleles and/or SCR expression products in
cultivars as compared to wild-type plants. They can also be
used as markers for linkage analysis of qualitative trait
loci. It is also possible that the SCR gene may encode a
product responsible for a qualitative trait that is desirable
in a crop breeding program. Alternatively, the SCR protein,
peptides and/or antibodies can be used as reagents in
immunoassays to detect expression of the SCR gene in
cultivars and wild-type plants.

5.2. SCR PROMOTERS

According to the present invention, SCR promoters
and functional portions thereof described herein refer to
regions of the SCR gene which are capable of promoting
tissue-specific expression in embryos and/or roots of an
operably linked coding sequence in plants. The SCR promoter
described herein refers to the regulatory elements of SCR
genes, i.e., regulatory regions of genes which are capable of
selectively hybridizing to the nucleic acids described in
Section 5.1, or regulatory sequences contained, for example,
in the region between the translational start site of the
Arabidopsis SCR gene and the HindIII site approximately 2.5
kb upstream of the site in plasmid pLZGl*3/SAC+Mob21SAC (see
FIGS. 5A and 14) in hybridization assays, or which are
homologous by sequence analysis (containing a span of 10 or
more nucleotides in which at least 50 percent of the
nucleotides are identical to the sequences presented herein).
Homologous nucleotide sequences refer to nucleotide sequences
including, but not limited to, SCR promoters in diverse plant
species (e.g., promoters of orthologs of Arabidopsis SCR) as
well as genetically engineered derivatives of the promoters
described herein.
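
By way of illustration only, the sequence-analysis criterion
stated above (a span of 10 or more nucleotides in which at
least 50 percent of the nucleotides are identical) may be
sketched in Python as follows. The function name
has_homologous_span, its parameters, and the restriction to
ungapped comparison are assumptions introduced for this sketch
and are not part of the disclosure.

```python
def has_homologous_span(query, reference, min_span=10, min_identity=0.5):
    """Return True if some ungapped alignment of `query` against
    `reference` contains a span of at least `min_span` aligned
    nucleotides of which at least `min_identity` are identical.

    Hypothetical helper: gaps are not considered (an assumption).
    """
    q, r = query.upper(), reference.upper()
    # `offset` is the reference position aligned with query position 0.
    for offset in range(-len(q) + 1, len(r)):
        q_start = max(0, -offset)
        r_start = max(0, offset)
        length = min(len(q) - q_start, len(r) - r_start)
        if length < min_span:
            continue
        # prefix[i] = number of identical positions among the first i.
        prefix = [0]
        for i in range(length):
            prefix.append(prefix[-1] + (q[q_start + i] == r[r_start + i]))
        # Test every span of min_span or more positions on this diagonal.
        for start in range(length - min_span + 1):
            for end in range(start + min_span, length + 1):
                if prefix[end] - prefix[start] >= min_identity * (end - start):
                    return True
    return False
```

Spans longer than the minimum are tested explicitly because a
longer span can satisfy the 50 percent threshold even when no
10-nucleotide window within it does.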

Methods which could be used for the synthesis,
isolation, molecular cloning, characterization and
manipulation of SCR promoter sequences are well known to
those skilled in the art. See, e.g., the techniques
described in Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory,
Cold Spring Harbor, New York (1989).
According to the present invention, SCR promoter
sequences or portions thereof described herein may be
obtained from appropriate plant or mammalian sources from
cell lines or recombinant DNA constructs containing SCR
promoter sequences, and/or by chemical synthetic methods.
SCR promoter sequences can be obtained from genomic clones
containing sequences 5' upstream of SCR coding sequences.
Such 5' upstream clones may be obtained by screening genomic
libraries using SCR protein coding sequences, particularly
those encoding SCR N-terminal sequences, from SCR gene clones
obtained as described in Sections 5.1. and 5.2. Standard
methods that may be used in such screening include, for example,
the method set forth in Benton & Davis, 1977, Science 196:180
for bacteriophage libraries; and Grunstein & Hogness, 1975,
Proc. Nat. Acad. Sci. U.S.A. 72:3961-3965 for plasmid
libraries.

The full extent and location of SCR promoters
within such 5' upstream clones may be determined by the
functional assay described below. In the event a 5' upstream
clone does not contain the entire SCR promoter as determined
by the functional assay, the insert DNA of the clone may be
used to isolate genomic clones containing sequences further
5' upstream of the SCR coding sequences. Such further
upstream sequences can be spliced on to existing 5' upstream
sequences and the reconstructed 5' upstream region tested for
functionality as a SCR promoter (i.e., promoting tissue-
specific expression in embryos and/or roots of an operably
linked gene in plants). This process may be repeated until the
complete SCR promoter is obtained.

The location of the SCR promoter within genomic
sequences 5' upstream of the SCR gene isolated as described
above may be determined using any method known in the art.
For example, the 3'-end of the promoter may be identified by
locating the transcription initiation site, which may be
determined by methods such as RNase protection (e.g., Liang
et al., 1989, J. Biol. Chem. 264:14486-14498), primer
extension (e.g., Weissenborn & Larson, 1992, J. Biol. Chem.
267:6122-6131), and/or reverse transcriptase/PCR. The
location of the 3'-end of the promoter may be confirmed by
sequencing and computer analysis, examining for the canonical
AGGA or TATA boxes of promoters that are typically 50-60 base
pairs (bp) and 25-35 bp 5'-upstream of the transcription
initiation site. The 5'-end of the promoter may be defined by
deleting sequences from the 5'-end of the promoter-containing
fragment, constructing a transcriptional or translational
fusion of the resected fragment and a reporter gene, and
examining the expression characteristics of the chimeric gene
in transgenic plants. Reporter genes that may be used to
such ends include, but are not limited to, GUS, CAT,
luciferase, β-galactosidase and the C1 and R genes controlling
anthocyanin production.
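
The computer analysis referred to above, in which a sequence
is examined for a TATA box 25-35 bp (or an AGGA box 50-60 bp)
5' upstream of the transcription initiation site, may be
sketched in Python as follows. This is a minimal illustration
under stated assumptions: the function name
find_box_candidates, the exact-match motif strings, and the
0-based index convention for the initiation site are all
hypothetical and are not taken from the disclosure.

```python
def find_box_candidates(sequence, tss_index, motif="TATA", window=(25, 35)):
    """Return the distances (in bp upstream of the transcription
    initiation site) at which `motif` begins within `window`.

    `tss_index` is the 0-based position of the initiation site in
    `sequence` (an assumed convention for this sketch).
    """
    seq = sequence.upper()
    nearest, farthest = window
    hits = []
    for distance in range(nearest, farthest + 1):
        start = tss_index - distance
        # Skip windows that would fall before the start of the sequence.
        if start >= 0 and seq[start:start + len(motif)] == motif:
            hits.append(distance)
    return hits
```

For the AGGA box the same function would be called with
motif="AGGA" and window=(50, 60). Real promoter elements are
often degenerate, so an exact string match is only a first
approximation.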

According to the present invention, a SCR promoter
is one that confers to an operably linked gene in a
transgenic plant tissue-specific expression in roots, root
nodules, stems and/or embryos. A SCR promoter comprises the
region between about -5,000 bp and +1 bp upstream of the
transcription initiation site of SCR gene. In a particular
embodiment, the Arabidopsis SCR promoter comprises the region
between positions -2.5 kb and +1 in the 5' upstream region of
the Arabidopsis SCR gene (see FIGS. 5A and 14).

5.2.1. CIS-REGULATORY ELEMENTS OF SCR PROMOTERS
According to the present invention, the cis-
regulatory elements within a SCR promoter may be identified
using any method known in the art. For example, the location
of cis-regulatory elements within an inducible promoter may
be identified using methods such as DNase or chemical
footprinting (e.g., Meier et al., 1991, Plant Cell 3:309-315)
or gel retardation (e.g., Weissenborn & Larson, 1992, J.
Biol. Chem. 267:6122-6131; Beato, 1989, Cell 56:335-344;
Johnson et al., 1989, Ann. Rev. Biochem. 58:799-839).
Additionally, resectioning experiments may also be employed
to define the location of the cis-regulatory elements. For
example, an inducible promoter-containing fragment may be
resected from either the 5' or 3'-end using restriction
enzyme or exonuclease digests.
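
For planning purposes, a resection series of the kind
described above may be enumerated in silico before the
fragments are prepared. The following Python sketch is
illustrative only; the function name resection_series and the
fixed step size are assumptions, since exonuclease digests in
practice yield variable end points.

```python
def resection_series(fragment, step, min_length, end="5'"):
    """List progressively shorter versions of `fragment`, removing
    `step` nucleotides at a time from the chosen end and stopping
    before the fragment would become shorter than `min_length`.

    Hypothetical helper for planning deletion constructs.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    cuts = range(0, len(fragment) - min_length + 1, step)
    if end == "5'":
        return [fragment[cut:] for cut in cuts]
    if end == "3'":
        return [fragment[:len(fragment) - cut] for cut in cuts]
    raise ValueError("end must be \"5'\" or \"3'\"")
```

Each fragment in the returned list would then be fused to a
reporter gene or truncated promoter and tested as described in
this section.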

To determine the location of cis-regulatory
elements within the sequence containing the inducible
promoter, the 5'- or 3'-resected fragments, internal
fragments to the inducible promoter containing sequence, or
inducible promoter fragments containing sequences identified
by footprinting or gel retardation experiments may be fused
to the 5' end of a truncated plant promoter, and the activity
of the chimeric promoter in transgenic plants examined.
Useful truncated promoters to these ends comprise sequences
starting at or about the transcription initiation site and
extending to no more than 150 bp 5' upstream. These
truncated promoters generally are inactive or are only
minimally active. Examples of such truncated plant promoters
may include, among others, a "minimal" CaMV 35S promoter
whose 5' end terminates at position -46 bp with respect to
the transcription initiation site (Skriver et al., Proc.
Natl. Acad. Sci. USA 88:7266-7270); the truncated "-90 35S"
promoter in the X-GUS-90 vector (Benfey & Chua, 1989, Science
244:174-181); a truncated "-101 nos" promoter derived from
the nopaline synthase promoter (Aryan et al., 1991, Mol. Gen.
Genet. 225:65-71); and the truncated maize Adh-1 promoter in
pADcat 2 (Ellis et al., 1987, EMBO J. 6:11-16).

According to the present invention, a cis-
regulatory element of a SCR promoter is a sequence that
confers to a truncated promoter tissue-specific expression in
embryos, stems, root nodules and/or roots.



5.2.2. SCR PROMOTER-DRIVEN EXPRESSION VECTORS
The properties of the nucleic acid sequences are
varied as are the genetic structures of various potential
host plant cells. In the preferred embodiments of the
present invention, described herein, a number of features
which an artisan may recognize as not being absolutely
essential, but clearly advantageous, are used. These include
methods of isolation, synthesis or construction of gene
constructs, the manipulation of the gene constructs to be
introduced into plant cells, certain features of the gene
constructs, and certain features of the vectors associated
with the gene constructs.

Further, the gene constructs of the present
invention may be encoded on DNA or RNA molecules. According
to the present invention, it is preferred that the desired,
stable genotypic change of the target plant be effected
through genomic integration of exogenously introduced nucleic
acid construct(s), particularly recombinant DNA constructs.
Nonetheless, according to the present invention, such
genotypic changes can also be effected by the introduction of
episomes (DNA or RNA) that can replicate autonomously and
that are somatically and germinally stable. Where the
introduced nucleic acid constructs comprise RNA, plant
transformation or gene expression from such constructs may
proceed through a DNA intermediate produced by reverse
transcription.

The present invention provides for use of
recombinant DNA constructs which contain tissue-specific and
developmental-specific promoter fragments and functional
portions thereof. As used herein, a functional portion of a
SCR promoter is capable of functioning as a tissue-specific
promoter in the embryo, stem, root nodule and/or root of a
plant. The functionality of such sequences can be readily
established by any method known in the art. Such methods
include, for example, constructing expression vectors with
such sequences and determining whether they confer tissue-
specific expression in the embryo, stem, root nodule and/or
root to an operably linked gene. In a particular embodiment,
the invention provides for the use of the Arabidopsis SCR
promoter contained in the sequences depicted in FIGS. 5A and
14 and the insert DNA of plasmid pGEX-2TK^.

The SCR promoters of the invention may be used to
direct the expression of any desired protein, or to direct
the expression of a RNA product, including, but not limited
to, an "antisense" RNA or ribozyme. Such recombinant
constructs generally comprise a native SCR promoter or a
recombinant SCR promoter derived therefrom, ligated to the
nucleic acid sequence encoding a desired heterologous gene
product.

A recombinant SCR promoter is used herein to refer
to a promoter that comprises a functional portion of a native
SCR promoter or a promoter that contains native promoter
sequences that is modified by a regulatory element from a SCR
promoter. Alternatively, a recombinant inducible promoter
derived from the SCR promoter may be a chimeric promoter,
comprising a full-length or truncated plant promoter modified
by the attachment of one or more SCR cis-regulatory elements.
The manner of chimeric promoter constructions may
be any well known in the art. For examples of approaches
that can be used in such constructions, see Section 5.1.2.,
above and Fluhr et al., 1986, Science 232:1106-1112; Ellis et
al., 1987, EMBO J. 6:11-16; Strittmatter & Chua, 1987, Proc.
Natl. Acad. Sci. USA 84:8986-8990; Poulsen & Chua, 1988, Mol.
Gen. Genet. 214:16-23; Comai et al., 1991, Plant Mol. Biol.
15:373-381; Aryan et al., 1991, Mol. Gen. Genet. 225:65-71.

According to the present invention, where a SCR
promoter or a recombinant SCR promoter is used to express a
desired protein, the DNA construct is designed so that the
protein coding sequence is ligated in phase with the
translational initiation codon downstream of the promoter.
Where the promoter fragment is missing 5' leader sequences, a
DNA fragment encoding both the protein and its 5' RNA leader
sequence is ligated immediately downstream of the
transcription initiation site. Alternatively, an unrelated
5' RNA leader sequence may be used to bridge the promoter and
the protein coding sequence. In such instances, the design
should be such that the protein coding sequence is ligated in
phase with the initiation codon present in the leader
sequence, or ligated such that no initiation codon is
interposed between the transcription initiation site and the
first methionine codon of the protein.
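
The two design conditions stated above (ligation in phase
with an initiation codon present in the leader, or absence of
any initiation codon between the transcription initiation
site and the first methionine codon of the protein) may be
checked on a candidate construct with a short Python sketch
such as the following. The function name is_in_phase, the
additional check for an intervening stop codon, and the use of
DNA-sense strings are assumptions made for this illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def is_in_phase(leader, coding):
    """Return True if `coding` is correctly positioned relative to
    the first ATG of the transcribed region `leader + coding`.

    Hypothetical helper: `leader` is the 5' leader from the
    transcription initiation site; `coding` is the protein coding
    sequence. Both are given as DNA sense-strand strings.
    """
    leader, coding = leader.upper(), coding.upper()
    construct = leader + coding
    first_atg = construct.find("ATG")
    if first_atg == -1:
        return False  # no initiation codon at all
    if first_atg >= len(leader):
        # No ATG in the leader: the first ATG must open the coding sequence.
        return first_atg == len(leader)
    # An ATG lies in the leader: the coding sequence must be in the same frame.
    if (len(leader) - first_atg) % 3 != 0:
        return False
    # Assumed extra check: no stop codon between that ATG and the junction.
    for pos in range(first_atg, len(leader), 3):
        if construct[pos:pos + 3] in STOP_CODONS:
            return False
    return True
```

A construct failing this check would typically be corrected
by adding or removing nucleotides at the leader and coding
sequence junction.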

Further, it may be desirable to include additional
DNA sequences in the protein expression constructs. Examples
of additional DNA sequences include, but are not limited to,
those encoding: a 3' untranslated region; a transcription
termination and polyadenylation signal; an intron; a signal
peptide (which facilitates the secretion of the protein); or
a transit peptide (which targets the protein to a particular
cellular compartment such as the nucleus, chloroplast,
mitochondria, or vacuole).



5.3. PRODUCTION OF TRANSGENIC PLANTS AND PLANT 
CELLS 

According to the present invention, a desirable
plant or plant cell may be obtained by transforming a plant
cell with the nucleic acid constructs described herein. In
some instances, it may be desirable to engineer a plant or
plant cell with several different gene constructs. Such
engineering may be accomplished by transforming a plant or
plant cell with all of the desired gene constructs
simultaneously. Alternatively, the engineering may be
carried out sequentially. That is, transforming with one
gene construct, obtaining the desired transformant after
selection and screening, transforming the transformant with a
second gene construct, and so on.

In an embodiment of the present invention,
Agrobacterium is employed to introduce the gene constructs
into plants. Such transformations preferably use binary
Agrobacterium T-DNA vectors (Bevan, 1984, Nuc. Acid Res.
12:8711-8721), and the co-cultivation procedure (Horsch et
al., 1985, Science 227:1229-1231). Generally, the
Agrobacterium transformation system is used to engineer
dicotyledonous plants (Bevan et al., 1982, Ann. Rev. Genet.
16:357-384; Rogers et al., 1986, Methods Enzymol. 118:627-
641). The Agrobacterium transformation system may also be
used to transform, as well as transfer, DNA to
monocotyledonous plants and plant cells (see Hernalsteen et
al., 1984, EMBO J 3:3039-3041; Hooykass-Van Slogteren et al.,
1984, Nature 311:763-764; Grimsley et al., 1987, Nature
325:177-179; Boulton et al., 1989, Plant Mol. Biol. 12:31-
40; Gould et al., 1991, Plant Physiol. 95:426-434).

In other embodiments, various alternative methods
for introducing recombinant nucleic acid constructs into
plants and plant cells may also be utilized. These other
methods are particularly useful where the target is a
monocotyledonous plant or plant cell. Alternative gene
transfer and transformation methods include, but are not
limited to, protoplast transformation through calcium-,
polyethylene glycol (PEG)- or electroporation-mediated uptake
of naked DNA (see Paszkowski et al., 1984, EMBO J 3:2717-
2722; Potrykus et al., 1985, Mol. Gen. Genet. 199:169-177;
Fromm et al., 1985, Proc. Natl. Acad. Sci. USA 82:5824-5828;
Shimamoto, 1989, Nature 338:274-276), and electroporation of
plant tissues (D'Halluin et al., 1992, Plant Cell 4:1495-
1505). Additional methods for plant cell transformation
include microinjection, silicon carbide mediated DNA uptake
(Kaeppler et al., 1990, Plant Cell Reporter 9:415-418), and
microprojectile bombardment (see Klein et al., 1988, Proc.
Natl. Acad. Sci. USA 85:4305-4309; Gordon-Kamm et al., 1990,
Plant Cell 2:603-618).

According to the present invention, a wide variety
of plants may be engineered for the desired physiological and
agronomic characteristics described herein using the nucleic
acid constructs of the instant invention and the various
transformation methods mentioned above. In preferred
embodiments, target plants for engineering include, but are
not limited to, crop plants such as maize, wheat, rice,
soybean, tomato, tobacco, carrots, peanut, potato, sugar
beets, sunflower, yam, Arabidopsis, rape seed, and petunia;
and trees such as spruce.

According to the present invention, desired plants
and plant cells may be obtained by engineering the gene
constructs described herein into a variety of plant cell
types, including but not limited to, protoplasts, tissue
culture cells, tissue and organ explants, pollen, embryos as
well as whole plants. In an embodiment of the present
invention, the engineered plant material is selected or
screened for transformants (i.e., those that have
incorporated or integrated the introduced gene construct(s))
following the approaches and methods described below. An
isolated transformant may then be regenerated into a plant.
Alternatively, the engineered plant material may be
regenerated into a plant, or plantlet, before subjecting the
derived plant, or plantlet, to selection or screening for the
marker gene traits. Procedures for regenerating plants from
plant cells, tissues or organs, either before or after
selecting or screening for marker gene(s), are well known to
those skilled in the art.

A transformed plant cell, callus, tissue or plant
may be identified and isolated by selecting or screening the
engineered plant material for traits encoded by the marker
genes present on the transforming DNA. For instance,
selection may be performed by growing the engineered plant
material on media containing inhibitory amounts of the
antibiotic or herbicide to which the transforming marker gene
construct confers resistance. Further, transformed plants
and plant cells may also be identified by screening for the
activities of any visible marker genes (e.g., the β-
glucuronidase, luciferase, B or C1 genes) that may be present
on the recombinant nucleic acid constructs of the present
invention. Such selection and screening methodologies are
well known to those skilled in the art.

Physical and biochemical methods may also be used
to identify a plant or plant cell transformant containing the
gene constructs of the present invention. These methods
include but are not limited to: 1) Southern analysis or PCR
amplification for detecting and determining the structure of
the recombinant DNA insert; 2) Northern blot, S-1 RNase
protection, primer-extension or reverse transcriptase-PCR
amplification for detecting and examining RNA transcripts of
the gene constructs; 3) enzymatic assays for detecting enzyme
or ribozyme activity, where such gene products are encoded by
the gene construct; 4) protein gel electrophoresis, western
blot techniques, immunoprecipitation, or enzyme-linked
immunoassays, where the gene construct products are proteins;
5) biochemical measurements of compounds produced as a
consequence of the expression of the introduced gene
constructs. Additional techniques, such as in situ
hybridization, enzyme staining, and immunostaining, may also
be used to detect the presence or expression of the
recombinant construct in specific plant organs and tissues.
The methods for doing all these assays are well known to
those skilled in the art.

5.3.1. TRANSGENIC PLANTS THAT ECTOPICALLY 
EXPRESS SCR 

In accordance with the present invention, a plant
that expresses a recombinant SCR gene may be engineered by
transforming a plant cell with a gene construct comprising a
plant promoter operably associated with a sequence encoding
SCR protein or a fragment thereof. (Operably associated is
used herein to mean that transcription controlled by the
"associated" promoter would produce a functional messenger
RNA, whose translation would produce the enzyme.) The plant
promoter may be constitutive or inducible. Useful
constitutive promoters include, but are not limited to, the
CaMV 35S promoter, the T-DNA mannopine synthetase promoter,
and their various derivatives. Useful inducible promoters
include but are not limited to the promoters of ribulose
bisphosphate carboxylase (RUBISCO) genes, chlorophyll a/b
binding protein (CAB) genes, heat shock genes, the defense
responsive gene (e.g., phenylalanine ammonia lyase genes),
wound induced genes (e.g., hydroxyproline rich cell wall
protein genes), chemically-inducible genes (e.g., nitrate
reductase genes, glucanase genes, chitinase genes, PR-1 genes
etc.), dark-inducible genes (e.g., asparagine synthetase gene
(Coruzzi and Tsai, U.S. Patent 5,256,558, October 26, 1993,
Gene Encoding Plant Asparagine Synthetase)), developmentally
regulated genes (e.g., Shoot Meristemless gene) to name just
a few.

In yet another embodiment of the present invention,
it may be advantageous to transform a plant with a gene
construct operably linking a modified or artificial promoter
to a sequence encoding SCR protein or a fragment thereof.
Typically, such promoters, constructed by recombining
structural elements of different promoters, have unique
expression patterns and/or levels not found in natural
promoters. See, e.g., Salinas et al., 1992, Plant Cell
4:1485-1493, for examples of artificial promoters constructed
from combining cis-regulatory elements with a promoter core.

In a preferred embodiment of the present invention,
the associated promoter is a strong and root, root nodule,
stem and/or embryo-specific plant promoter such that the SCR
protein is overexpressed in the transgenic plant. Examples
of root- and root nodule-specific promoters include but are
not limited to the promoters of SCR genes, SHR genes,
leghemoglobin genes, nodulin genes and root-specific
glutamine synthetase genes (See, e.g., Tingey et al., 1987,
EMBO J. 6:1-9; Edwards et al., 1990, Proc. Nat. Acad. Sci.
USA 87:3459-3463).

In yet another preferred embodiment of the present
invention, the overexpression of SCR protein in roots may be
engineered by increasing the copy number of the SCR gene.
One approach to producing such transgenic plants is to
transform with nucleic acid constructs that contain multiple
copies of the complete SCR gene (i.e., with its own native
SCR promoter). Another approach is to repeatedly transform
successive generations of a plant line with one or more
copies of the complete SCR gene. Yet another approach is to
place a complete SCR gene in a nucleic acid construct
containing an amplification-selectable marker (ASM) gene such
as the glutamine synthetase or dihydrofolate reductase gene.
Cells transformed with such constructs are subjected to
culturing regimes that select cell lines with increased
copies of the complete SCR gene. See, e.g., Donn et al., 1984,
J. Mol. Appl. Genet. 2:549-562, for a selection protocol used
to isolate a plant cell line containing amplified copies
of the GS gene. Because the desired gene is closely linked
to the ASM, cell lines that amplified the ASM gene are also
likely to have amplified the SCR gene. Cell lines with
amplified copies of the SCR gene can then be regenerated into
transgenic plants.

5.3.2. TRANSGENIC PLANTS THAT SUPPRESS
ENDOGENOUS SCR EXPRESSION

In accordance with the present invention, a desired
plant may be engineered by suppressing SCR activity. In one
embodiment, the suppression may be engineered by transforming
a plant with a gene construct encoding an antisense RNA or
ribozyme complementary to a segment or the whole of the SCR RNA
transcript, including the mature target mRNA. In another
embodiment, SCR gene suppression may be engineered by
transforming a plant cell with a gene construct encoding a
ribozyme that cleaves the SCR mRNA transcript.
Alternatively, the plant can be engineered, e.g., via
targeted homologous recombination, to inactivate or "knock out"
expression of the plant's endogenous SCR.

For all of the aforementioned suppression
constructs, it is preferred that such gene constructs express
specifically in the root, root nodule, stem and/or embryo
tissues. Alternatively, it may be preferred to have the
suppression constructs expressed constitutively. Thus,
constitutive promoters, such as the nopaline synthase or CaMV
35S promoter, may also be used to express the suppression
constructs. A most preferred promoter for these suppression
constructs is a SCR or SHR promoter.

In accordance with the present invention, desired
plants with suppressed target gene expression may also be
engineered by transforming a plant cell with a co-suppression
construct. A co-suppression construct comprises a functional
promoter operatively associated with a complete or partial
SCR gene sequence. It is preferred that the operatively
associated promoter be a strong, constitutive promoter, such
as the CaMV 35S promoter. Alternatively, the co-suppression
construct promoter can be one that expresses with the same
tissue and developmental specificity as the SCR gene.

According to the present invention, it is preferred
that the co-suppression construct encodes an incomplete SCR
mRNA, although a construct encoding a fully functional SCR
mRNA or enzyme may also be useful in effecting
co-suppression.

In accordance with the present invention, desired
plants with suppressed target gene expression may also be
engineered by transforming a plant cell with a construct that
can effect site-directed mutagenesis of the SCR gene. (See,
e.g., Offringa et al., 1990, EMBO J. 9:3077-84; and Kanevskii
et al., 1990, Dokl. Akad. Nauk. SSSR 312:1505-1507, for
discussions of nucleic acid constructs for effecting
site-directed mutagenesis of target genes in plants.) It is
preferred that such constructs effect suppression of the SCR
gene by replacing the endogenous SCR gene sequence through
homologous recombination with no SCR protein coding sequence
or with an inactive one.

5.3.3. TRANSGENIC PLANTS THAT EXPRESS A
TRANSGENE CONTROLLED BY THE SCR PROMOTER

In accordance with the present invention, a desired
plant may be engineered to express a gene of interest under
the control of the SCR promoter. SCR promoters and
functional portions thereof refer to regions of the nucleic
acid sequence which are capable of promoting tissue-specific
transcription of an operably linked gene of interest in the
embryo, stem, root nodule and/or root of a plant. The SCR
promoter described herein refers to the regulatory elements
of SCR genes as described in Section 5.2.

Genes that may be beneficially expressed in the
roots and/or root nodules of plants include genes involved in
nitrogen fixation or cytokinins or auxins, or genes which
regulate growth, or growth of roots. In addition, genes
encoding proteins that confer on plants herbicide, salt, or
pest resistance may be engineered for root specific
expression. The nutritional value of root crops may also be
enhanced through SCR promoter driven expression of
nutritional proteins. Alternatively, therapeutically useful
proteins may be expressed specifically in root crops.
Genes that may be beneficially expressed in the
stems of plants include those involved in starch, lignin or
cellulose biosynthesis.

In accordance with the present invention, desired 
plants which express a heterologous gene of interest under 
the control of the SCR promoter may be engineered by 
transforming a plant cell with SCR promoter driven constructs 
using the techniques described in Sections 5.2.2. and 5.3.,
supra.



5.3.4. SCREENING OF TRANSFORMED PLANTS FOR THOSE
HAVING DESIRED ALTERED TRAITS

It will be recognized by those skilled in the art
that in order to obtain transgenic plants having the desired
engineered traits, screening of transformed plants (i.e.,
those having a gene construct of the invention) having those
traits may be required. For example, where the plants have
been engineered for ectopic overexpression of the SCR gene,
transformed plants are examined for those expressing the SCR
gene at the desired level and in the desired tissues and
developmental stages. Where the plants have been engineered
for suppression of the SCR gene product, transformed plants
are examined for those expressing the SCR gene product (e.g.,
RNA or protein) at reduced levels in various tissues. The
plants exhibiting the desired physiological changes, e.g.,
ectopic SCR overexpression or SCR suppression, may then be
subsequently screened for those plants that have the desired
structural changes at the plant level (e.g., transgenic
plants with overexpression or suppression of the SCR gene
having the desired altered root structure). The same principle
applies to obtaining transgenic plants having tissue-specific
expression of a heterologous gene in embryos and/or roots by
the use of a SCR promoter driven expression construct.

Alternatively, the transformed plants may be
directly screened for those exhibiting the desired structural
and functional changes. In one embodiment, such screening
may be for the size, length or pattern of the root of the
transformed plants. In another embodiment, the screening of
the transformed plants may be for altered gravitropism or
decreased susceptibility to lodging. In other embodiments,
the screening of the transformed plants may be for improved
agronomic characteristics (e.g., faster growth, greater
vegetative or reproductive yields, or improved protein
contents, etc.), as compared to unengineered progenitor
plants, when cultivated under various growth conditions
(e.g., soils or media containing different amounts of
nutrients or water content).

According to the present invention, plants
engineered with SCR overexpression may exhibit improved
vigorous growth characteristics when cultivated under
conditions where larger and thicker roots are advantageous.
Plants engineered for SCR suppression may exhibit improved
vigorous growth characteristics when cultivated under
conditions where thinner roots are advantageous.

Engineered plants and plant lines possessing such
improved agronomic characteristics may be identified by
examining any of the following parameters: 1) the rate of growth,
measured in terms of rate of increase in fresh or dry weight;
2) vegetative yield of the mature plant, in terms of fresh or
dry weight; 3) the seed or fruit yield; 4) the seed or fruit
weight; 5) the total nitrogen content of the plant; 6) the
total nitrogen content of the fruit or seed; 7) the free
amino acid content of the plant; 8) the free amino acid
content of the fruit or seed; 9) the total protein content of
the plant; and 10) the total protein content of the fruit or
seed. The procedures and methods for examining these
parameters are well known to those skilled in the art.

According to the present invention, a desired plant
is one that exhibits improvement over the control plant
(i.e., progenitor plant) in one or more of the aforementioned
parameters. In an embodiment, a desired plant is one that
shows at least a 5% increase over the control plant in at least
one parameter. In a preferred embodiment, a desired plant is
one that shows at least a 20% increase over the control plant
in at least one parameter. Most preferred is a plant that
shows at least a 50% increase in at least one parameter.
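
The percentage thresholds above can be illustrated with a short calculation. The following Python sketch is only an illustration: the function names, the tier labels and the example dry-weight values are hypothetical and are not taken from the specification.

```python
def percent_increase(control: float, engineered: float) -> float:
    """Percent change of an engineered plant's parameter relative to the control."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return (engineered - control) / control * 100.0


def embodiment_tier(increase_pct: float) -> str:
    """Map a percent increase onto the 5%, 20% and 50% thresholds in the text."""
    if increase_pct >= 50.0:
        return "most preferred"
    if increase_pct >= 20.0:
        return "preferred"
    if increase_pct >= 5.0:
        return "desired"
    return "below threshold"


# Hypothetical dry weights in grams (illustrative values only).
control_weight = 10.0
engineered_weight = 12.5
tier = embodiment_tier(percent_increase(control_weight, engineered_weight))
print(tier)
```

Any of the ten parameters listed above could be passed through the same two functions, one parameter at a time.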

6. EXAMPLE 1: ARABIDOPSIS SCR GENE
This example describes the cloning and structure of
the Arabidopsis SCR gene and its expression. The deduced
amino acid sequence of the Arabidopsis SCR gene product
contains a number of potential functional domains similar to
those found in transcription factors. Closely related
sequences have been found in both dicots and monocots,
indicating that Arabidopsis SCR is a member of a new protein
family. The expression pattern of the SCR gene was
characterized by means of in situ hybridization and by an
enhancer trap insertion upstream of the SCR gene (described
in more detail in Section 7). The expression pattern is
consistent with a key role for Arabidopsis SCR in regulating
the asymmetric division of the cortex/endodermis initial,
which is essential for generating the radial organization of
the root.

6.1. MATERIALS AND METHODS

6.1.1. PLANT CULTURE
Arabidopsis ecotypes Wassilewskija (Ws), Columbia
(Col), and Landsberg erecta (Ler) were obtained from Lehle.
Arabidopsis seeds were surface sterilized and grown as
described previously (Benfey et al., 1993, Development
119:57-70). Generation of the enhancer trap lines is
described in Section 7.



6.1.2. GENETIC ANALYSIS

For the scr-1 allele, co-segregation of the mutant
phenotype and kanamycin resistance conferred by the inserted
T-DNA was determined as described previously (Aeschbacher et
al., 1995, Genes & Development 9:330-340). Because kanamycin
affects root growth, 1557 seeds from heterozygous lines were
germinated on non-selective media, scored for the appearance
of the mutant phenotype, and subsequently transferred to
selective media. All (284) phenotypically mutant seedlings
showed resistance to the antibiotic, whereas 834 of 1273
phenotypically wild-type seedlings showed resistance to
kanamycin. Phenotypically wild-type plants
(83) were also transferred to soil and allowed to set seeds.
The progeny of these plants were plated on selective and non-
selective media, and scored for the co-segregation of the
mutant phenotype and antibiotic resistance. A majority (48)
of the plants segregated for the mutant phenotype and for
kanamycin resistance, whereas 35 were wild-type and sensitive
to kanamycin. Due to a mis-identified cross, scr-2 was
originally thought to be non-allelic and was named pinocchio
(Scheres et al., 1995, Development 121:53-62). Subsequent
mapping results placed it at the same chromosomal
location as scr-1. The original scr-2 line contained at
least two T-DNA inserts. Co-segregation analysis revealed a
lack of linkage between the antibiotic resistance marker
carried by the T-DNA and the mutant phenotype. Antibiotic
sensitive lines were identified that segregated for mutants.
These lines were crossed to scr-1. All F1 antibiotic
resistant progeny exhibited a mutant phenotype. All F2
progeny (from independent lines) were mutant, and there was a
3:1 segregation for antibiotic resistance, indicating that the
two mutations were allelic. Antibiotic sensitive lines of
scr-2 were found to contain a rearranged T-DNA insert as
determined by Southern blots and PCR using T-DNA specific
probes and primers, respectively. The presence of this T-DNA
in the SCR gene was confirmed by Southern blots using SCR
probes. A combination of T-DNA and SCR specific primers was
used to amplify T-DNA/SCR junctions. The PCR fragments were
cloned using the TA cloning kit (Invitrogen) and sequenced.
The insertion points were determined for both 5' and 3' T-
DNA/SCR junctions.
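
The segregation counts reported above lend themselves to a simple goodness-of-fit check. The sketch below is not part of the original analysis. It assumes that, for a single recessive insertion linked to the kanamycin marker, phenotypically wild-type seedlings should be resistant (heterozygous) or sensitive (homozygous wild-type) in a 2:1 ratio, and it uses the standard 5% critical value of 3.841 for one degree of freedom. The function name is hypothetical.

```python
def chi_square(observed, expected_ratio):
    """Pearson chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    statistic = 0.0
    for count, part in zip(observed, expected_ratio):
        expected = total * part / ratio_sum
        statistic += (count - expected) ** 2 / expected
    return statistic


# Counts from the text: 834 of 1273 phenotypically wild-type seedlings
# were kanamycin resistant, so 1273 - 834 = 439 were sensitive.
resistant = 834
sensitive = 1273 - resistant

# Assumption: a linked single insertion predicts 2 resistant : 1 sensitive.
statistic = chi_square([resistant, sensitive], [2, 1])

# 3.841 is the 5% critical value for one degree of freedom.
consistent_with_2_to_1 = statistic < 3.841
print(f"chi-square = {statistic:.3f}, consistent with 2:1: {consistent_with_2_to_1}")
```

The same function can be applied to the progeny-test counts (48 segregating lines and 35 non-segregating lines) by calling `chi_square([48, 35], [2, 1])`.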

6.1.3. MAPPING

Mutant plants of scr-2 (WS background) were crossed
to Col WT. DNA from mutant F2 individual plants was
analyzed for co-segregation with microsatellite (Bell &
Ecker, 1994, Genomics 18:137-144) and CAPS markers (Konieczny
& Ausubel, 1993, Plant J. 4:403-410). The closest linkage
was found to two CAPS markers located at the bottom of
chromosome III. Only one out of 238 mutant chromosomes was
recombinant for the BGL1 marker (Konieczny & Ausubel, 1993,
Plant J. 4:403-410) and one out of 210 chromosomes was
recombinant for the cdc2b marker.
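
The recombinant counts above translate into approximate map distances. As a rough sketch, and assuming that for such tightly linked markers the recombination frequency in percent approximates the distance in centimorgans (no mapping function is applied), the calculation is as follows. The function name is hypothetical.

```python
def recombination_percent(recombinants: int, chromosomes: int) -> float:
    """Recombination frequency as a percentage of scored chromosomes."""
    if chromosomes <= 0:
        raise ValueError("number of chromosomes must be positive")
    return 100.0 * recombinants / chromosomes


# Counts reported in the text for the two closest CAPS markers.
markers = {"BGL1": (1, 238), "cdc2b": (1, 210)}

for name, (recombinants, chromosomes) in markers.items():
    distance = recombination_percent(recombinants, chromosomes)
    print(f"{name}: {distance:.2f}% recombination (about {distance:.2f} cM)")
```

Both values fall just under 0.5%, which agrees with the placement of the scr locus approximately 0.5 cM from each marker reported in Section 6.2.3.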

An RFLP for the SCR gene was identified between Col
and Ler ecotypes with Xho I endonuclease. Genomic DNAs from
independent recombinant inbred (RI) lines (Jarvis et al., 1994,
Plant Mol. Biol. 24:685-687) were digested with Xho I and blots
were hybridized to SCR. Using the segregation data obtained for
25 RI lines, the SCR gene was mapped relative to molecular
markers by CLUSTER. The SCR gene was assigned to the bottom
of chromosome III closest to BGL1.

6.1.4. PHENOTYPIC ANALYSIS

Morphological characterization of the mutant roots
was performed as follows: 7 to 14 days post-germination,
phenotypically mutant seedlings were fixed in 4.0%
formaldehyde in PIPES buffer pH 7.2. After fixation the
samples were dehydrated in ethanol followed by infiltration
with Historesin (Jung-Leica, Heidelberg, Germany). Plastic
sections were mounted on Superfrost slides (Fisher). The
sections were either stained with 0.05% toluidine blue and
photographed using Kodak 160T film or used for Casparian
strip detection or antibody staining.

Casparian strip detection was performed as
described previously (Scheres et al., 1995, Development
121:53-62), with the following modifications. Plastic
sections were used and the counterstaining was done in 0.1%
aniline blue for 5 to 15 min. The sections were visualized
with a Leitz fluorescent microscope with FITC filter.



wo 97/41152 



PCT/US97/07022 



Pictures were taken using a Leitz camera attached to the
microscope and Kodak HC400 film. Slides were digitized with
a Nikon slide scanner and manipulated in Adobe Photoshop.

For antibody staining, sections were blocked for 2
hours at room temperature in 1% BSA in PBS containing 0.1%
Tween 20 (PBT). Samples were incubated with primary
antibodies at 4°C in 1% BSA in PBT overnight, and then
washed 3 times, 5 minutes each, with PBT. Samples were
incubated for two hours with biotinylated secondary
antibodies (Vector Laboratories) in PBT, and washed as above.
Samples were incubated with Texas Red conjugated avidin D for
2 hours at room temperature, washed as before, and mounted in
Citifluor. Immunofluorescence was observed with a
fluorescent microscope equipped with a Rhodamine filter.
Staining with the CCRC antibodies was performed as described
previously (Freshour et al., 1996, Plant Physiol. 110:1413-
1429).

6.1.5. MOLECULAR TECHNIQUES

Genomic DNA preparation was performed using the
Elu-Quik kit (Schleicher & Schuell) protocol. Radioactive
and non-radioactive DNA probes were labeled with either
random primed labeling or PCR-mediated synthesis according to
the Genius kit manual (Boehringer Mannheim). E. coli and
Agrobacterium tumefaciens cells were transformed using a BIO-
RAD gene pulser. Plasmid DNA was purified using the alkaline
lysis method (Maniatis et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor, New York: Cold Spring
Harbor Laboratory, 1982).

A probe made from a rescued fragment of 1.2 kb was
used to screen a wild-type genomic library made from WS
plants. One genomic clone containing an insert of
approximately 23 kb was isolated. A 3.0 kb Sac I fragment
from the genomic clone, which hybridized to the 1.2 kb probe,
was subcloned and sequenced (FIG. 5A). Comparison of the
nucleotide sequence between the genomic clone and the rescued
plasmid revealed the site of the T-DNA insertion.
Approximately 600,000 plaques from a cDNA library, obtained
from inflorescences and siliques (Col ecotype), and therefore
enriched in embryos, were screened with the 1.2 kb probe.
Four cDNA clones were isolated. The dideoxy sequencing
method was performed using the Sequenase kit (United States
Biochemical Corp.). Sequence-specific internal primers were
synthesized and used to sequence the Sac I genomic fragment as
well as the cDNA clones. Total RNA from plant tissues was
obtained using phenol/chloroform extractions as described in
Berry et al. (1985, Mol. Cell. Biol. 5:2238-2246) with minor
modifications. Northern hybridization and detection were
performed according to the Genius kit manual (Boehringer
Mannheim).

To identify the site of insertion of the enhancer-
trap T-DNA, genomic DNA from ET199 homozygous plants was
amplified using primers specific for the T-DNA left border
and the SCR gene. An approximately 2.0 kb fragment was
amplified. This fragment was sequenced and the site of
insertion was found to be approximately 1 kb from the ATG
start codon.



6.1.6. IN SITU HYBRIDIZATION

Antisense and sense SCR riboprobes were labeled
with digoxigenin-11-UTP (Boehringer Mannheim) using T7
polymerase following the manufacturer's protocol. Probes
contained a 1.1 kb 3' portion of the cDNA. Probe
purification, hydrolysis and quantification were performed as
described in the Boehringer Mannheim Genius System user's
guide.

Tissue samples were fixed in 4% formaldehyde
overnight at 4°C and rinsed two times in PBS (Jackson et al.,
1991, Plant Cell 3:115-125). They were subsequently pre-
embedded in 1% agarose in PBS. The fixed tissue was
dehydrated in ethanol, cleared in Hemo-De (Fisher Scientific,
Pittsburgh, PA) and embedded in ParaplastPlus (Fisher
Scientific). Tissue sections (10 µm thick) were mounted on
SuperfrostPlus slides (Fisher Scientific). Section
pretreatment and hybridization were performed according to
Lincoln et al. (1994, Plant Cell 6:1859-1876), except that
proteinase K was used at 30 mg/ml and a two hour
prehybridization step was included. A probe concentration of
50 ng/ml/kb was used in the hybridization.
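
Because the probe concentration is normalized to probe length, the final concentration depends on the size of the riboprobe. As a worked sketch, and assuming the 1.1 kb probe described at the start of this section, the calculation is shown below. The function and variable names are illustrative only.

```python
def probe_concentration_ng_per_ml(rate_ng_per_ml_per_kb: float,
                                  probe_length_kb: float) -> float:
    """Final probe concentration for a length-normalized hybridization rate."""
    return rate_ng_per_ml_per_kb * probe_length_kb


# 50 ng/ml/kb is taken from the text; the 1.1 kb length is an assumption
# based on the probe description in this section.
concentration = probe_concentration_ng_per_ml(50.0, 1.1)
print(f"{concentration:.1f} ng/ml")
```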

Slides were washed and the immunological detection
was performed according to Coen et al. (1990, Cell 63:1311-
1322) with the following modifications. Slides were first
washed 5 h in 5xSSC, 50% formamide. After RNase treatment,
slides were rinsed three times (20 min each) in buffer
(0.5 M NaCl, 10 mM Tris-HCl pH 8.0, 5.0 mM EDTA). In the
immunological detection, antibody was diluted 1:1000,
levamisole (240 ng/ml) was included in the detection buffer,
and after stopping the reaction in 10 mM Tris, 1 mM EDTA,
sections were mounted directly in Aqua-Poly/Mount
(Polysciences, Warrington, PA).

6.2. RESULTS

6.2.1. CHARACTERIZATION OF THE SCR PHENOTYPE

The scarecrow mutant scr-1 was isolated in a screen
of T-DNA transformed Arabidopsis lines (Feldmann, K.A., 1991,
Plant J. 1:71-82), as a seedling with greatly reduced root
length compared to wild-type (Scheres et al., 1995,
Development 121:53-62). A second mutant, scr-2, with a similar
phenotype was subsequently identified among T-DNA transformed
lines. Analysis of co-segregation between the mutant
phenotype and antibiotic resistance carried by the T-DNA
indicated tight linkage for scr-1 and no linkage for scr-2
(see Section 6.1.2). An antibiotic sensitive line
of scr-2 was isolated and crossed with scr-1. The F2 progeny
of this cross were all mutant and segregated 3:1 for
antibiotic resistance, confirming allelism (see Section
6.1.2). The principal phenotypic difference between the
two alleles was that scr-1 root growth was more retarded than
that of scr-2, suggesting that it is the stronger allele
(FIG. 2A). For both alleles the aerial organs appeared
similar to wild-type and the flowers were fertile (FIGS. 2A
and 2B). The progeny of backcrosses of scr-1 or scr-2 to
wild-type plants segregated 3:1 for the root phenotype for
both alleles, indicating that each mutation is monogenic and
recessive.

Analysis of transverse sections through the primary
root of seedlings revealed only a single cell layer between
the epidermis and the pericycle (FIG. 2C) instead of the
normal radial organization consisting of cortex and
endodermis (FIG. 2D). This radial organization defect was
not limited to the primary root, but was also present in
secondary roots (FIG. 2E) and in roots regenerated from calli
(FIG. 2F). Occasionally defects were observed in the number
of cells in the remaining cell layer (more than the invariant
8 found in wild-type). Abnormal placement or numbers of
epidermal cells were also observed (see FIG. 2E). These
abnormalities were more frequently observed in scr-1 than in
scr-2. Nevertheless, organization of the mutant root closely
resembles that of wild-type except for the consistent
reduction in the number of cell layers. Because the
endodermis and cortex are normally generated by an asymmetric
division of the cortex/endodermal initial, this indicates
that the primary defect in scr is disruption of this
asymmetric division.

It has been shown that the radial organization
defect in scr-1 first appears in the developing embryo at the
early torpedo stage and manifests itself as a failure of the
embryonic ground tissue to undergo the asymmetric division
into cortex and endodermis (Scheres et al., 1995, Development
121:53-62). This defect extends the length of the embryonic
axis, which encompasses the embryonic root and hypocotyl.
Other embryonic tissues appear similar to wild-type (Scheres
et al., 1995, Development 121:53-62). In seedling hypocotyls
of the scarecrow phenotype, two cell layers instead of the
normal three layers (two cortex and one endodermis) between
epidermis and stele were found. This would be the expected
result of the lack of the division of the embryonic ground
tissue. Similar results were obtained for scr-2. Hence,
this mutant identifies a gene involved in the asymmetric
division that produces cortex and endodermis from ground
tissue in the embryonic root and hypocotyl and from the
cortex/endodermal initials in primary and secondary roots.

6.2.2. CHARACTERIZATION OF CELL IDENTITY IN SCR
ROOTS

To understand the role of the Arabidopsis SCR gene
in regulating this asymmetric division, it was necessary to
determine the identity of the mutant cell layer. Tissue-
specific markers were used to distinguish between several
possibilities. The cell layer could have differentiated
attributes of either cortex or endodermis. Alternatively, it
could have an undifferentiated, initial-cell identity or it
could have a chimeric identity with differentiated attributes
of both endodermis and cortex in the same cell.

Transverse sections of scr-1 and scr-2 roots were
assayed for the presence of tissue-specific markers. The
Casparian strip, a deposition of suberin between radial cell
walls, is specific to the endodermal cells and is believed to
act as a barrier to the entry of solutes into the vasculature
(Esau, K., Anatomy of Seed Plants, New York: John Wiley &
Sons, 1977, Ed. 2, pp. 1-550). Histochemical staining
revealed the presence of a Casparian strip in the mutant cell
layer (FIG. 3A, compare to wild-type, FIG. 3B). It is noted
that in the vascular cylinder, this histochemical stain also
reveals the presence of lignin, indicating the presence of
differentiated xylem cells in mutant (FIG. 3A) and wild-type
(FIG. 3B). Another marker of the differentiated endodermis
is the arabinogalactan epitope recognized by the monoclonal
antibody JIM13 (Knox et al., 1990, Planta 181:512-521). The
mutant cell layer showed staining with this antibody
(FIG. 3C, compare with wild-type, FIG. 3D). As a positive
control, the JIM7 antibody that recognizes pectin epitopes in
all cell walls was used (FIGS. 3E and 3F). These results
indicate that the cell layer between the epidermis and the
pericycle has differentiated attributes of the endodermis.

As a marker for the cortex, the CCRC-M2 monoclonal
antibody was used. This antibody recognizes a cell wall
oligosaccharide epitope found only on differentiated cortex
and epidermis cells. In sections from the differentiation
zone of scr-1 and scr-2, both cortex and epidermal cells
showed staining (FIGS. 4A and 4B) that was similar to that of
wild-type (FIG. 4C). In scr-1, staining of both cell types
was apparent, but staining of cortex was somewhat weaker than
wild-type. The positive control used the CCRC-M1 monoclonal
antibody, which recognizes an oligosaccharide epitope found on
all cells (FIGS. 4D-F).

With the CCRC-M2 antibody an interesting difference
was observed between the staining pattern of the mutants as
compared to wild-type. The appearance of this epitope
correlates with differentiation in these two cell types.
Normally, in sections close to the root tip there is no
staining; in sections higher up in the root, atrichoblasts
(epidermal cells that do not make root hairs) stain. In
sections from more mature root tissue, all epidermal cells as
well as cortex cells stain for this epitope. In both scr-1
and scr-2, sections could be found in which all epidermal
cells stained while there was little detectable staining of
cortex cells. Although not precisely identical to the wild-
type staining pattern, the fact that the mutant cell layer
clearly stains for this cortex marker indicates that there
are cortex differentiated attributes expressed in these
cells.

Taken together, these results indicate that the
mutant cell layer has differentiated attributes of both the
endodermis and cortex. The possibility that there has been a
simple deletion of a cell type, or that the resulting cell
type remains in an undifferentiated initial-like stage, can be
ruled out. This result is consistent with a role for the SCR
gene in regulating this asymmetric division rather than a
role in directing cell specification.

6.2.3. MOLECULAR CLONING OF THE SCR GENE

To further elucidate the function of the
Arabidopsis SCR gene, the inserted T-DNA sequences were used
to clone the gene. Plant DNA flanking the insertion site was
obtained from scr-1 by plasmid rescue and used to isolate the
corresponding wild-type genomic DNA. Several cDNA clones
were isolated from a library made from silique tissue.
Comparison of the sequence of the longest cDNA and the
corresponding genomic region revealed an open reading frame
(ORF) interrupted by a single small intron (FIG. 5A). A
potential TATA box and polyadenylation signal that matched
the consensus sequences for plant genes were also identified
(Joshi, C.P., 1987, Nucl. Acids Res. 15:6643-6653; Heidecker
& Messing, 1986, Ann. Rev. Plant Physiol. 37:439-466; Mogen
et al., 1990, Plant Cell 2:1261-1272).

Comparison of the nucleotide seguence between the 
genomic clone and the rescued plasmid placed the site of the 
T-DNA insertion in scr-I at codon 470 (FIGS. 5A and 5B) . For 
scr^2, although no linkage was found between the mutant 

20 phenotype and antibiotic resistance, DNA blot and PGR 

analysis of antibiotic sensitive lines revealed the presence 
of T-DNA seguences that co-segregated with the mutant 
phenotype. The insertion position in scr-2 was determined by 
cloning and seguencing the PGR products amplified from its 

25 genomic DNA using a combination of T-DNA and SCR specific 
primers at both sides of the insertion (FIG. 5B) . In scr-2 
the T-DNA insertion point is at codon 605 (FIG. 5A and 5B) . 
To verify linkage between the cloned gene and the mutant
phenotype, we identified the chromosomal location of both the
scr locus and the SCR gene. To map the scr locus, molecular
markers were used on F2 progeny of crosses between scr-2
(ecotype Wassilewskija, Ws) and Columbia (Col) WT. These
placed the scr locus at the bottom of chromosome III,
approximately 0.5 cM away from each of the two closest
markers available, cdc2b and BGL1 (Konieczny and Ausubel,
1993, Plant J. 4:403-410). To map the SCR gene, we
identified a polymorphism between Col and Landsberg (Ler)
ecotypes using the SCR probe b (FIG. 5B). Southern analysis
of 25 recombinant inbred lines (Jarvis et al., 1994, Plant
Mol. Biol. 24:685-687) mapped the cloned gene to the same
location as the scr locus on chromosome III.
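
As an illustration of how a recombination fraction relates to a map distance such as the 0.5 cM figure above, the following is a minimal sketch using the standard Haldane and Kosambi mapping functions. It assumes recombinant gametes can be counted directly; the counts used are hypothetical and are not data from the mapping experiments described here, and F2 populations generally require a more careful estimate of the recombination fraction.

```python
import math


def recombination_fraction(recombinants: int, total: int) -> float:
    """Fraction of scored gametes that are recombinant (hypothetical counts)."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("counts must satisfy 0 <= recombinants <= total, total > 0")
    return recombinants / total


def haldane_cm(r: float) -> float:
    """Haldane mapping function: map distance in cM, assuming no interference."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r: float) -> float:
    """Kosambi mapping function: map distance in cM, allowing some interference."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical example: 1 recombinant among 200 scored gametes.
r_example = recombination_fraction(1, 200)
```

For small recombination fractions both functions return a distance very close to 100 times the fraction, which is why tightly linked markers are often reported directly in cM.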
The determination of the molecular defects in two
independent alleles and the co-localization of the cloned
gene and the mutant locus confirm that we have identified
the SCR gene.



6.2.4. THE SCR GENE HAS MOTIFS THAT INDICATE IT
IS A TRANSCRIPTION FACTOR

The Arabidopsis SCR gene product is a 653 amino
acid polypeptide that contains several domains (FIG. 5B).
The amino-terminus has homopolymeric stretches of glutamine,
serine, threonine, and proline residues, which account for
44% of the first 267 residues. Domains rich in these
residues have been shown to activate transcription and may
serve such a role in SCR (Johnson et al., 1993, J. Nutr.
Biochem. 4:386-398). A charged region between residues 265
and 283 has similarity to the basic domain of the bZIP family
of transcriptional regulatory proteins (FIG. 5C) (Hurst,
H.C., 1994, Protein Profile 1:123-168). The basic domains
from several bZIP proteins have been shown to act as nuclear
localization signals (Varagona et al., 1992, Plant Cell
4:1213-1227), and this region in SCR may act similarly. This
charged region is followed by a leucine heptad repeat
(residues 291-322). A second leucine heptad repeat is found
toward the carboxy-terminus (residues 436 to 473). As
leucine heptad repeats have been demonstrated to mediate
protein-protein interactions in other proteins (Hurst, H.C.,
1994, Protein Profile 1:123-168), the existence of these
motifs suggests that SCR may function as a dimer or a
multimer. The second leucine heptad repeat is followed by a
small region rich in acidic residues, also present in a
number of defined transcriptional activation domains (Johnson
et al., 1993, J. Nutr. Biochem. 4:386-398). While each of
these domains has been found within proteins that do not act
as transcriptional regulators, the fact that all of them are
found within the deduced SCR protein sequence indicates that
SCR is a transcriptional regulatory protein.
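
The two sequence features discussed above, a region enriched in glutamine, serine, threonine and proline, and a leucine heptad repeat, can each be checked with a few lines of code. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the toy sequences are invented for demonstration and are not the SCR sequence, and a strict "leucine at every seventh position" rule is a simplification of how heptad repeats are usually scored.

```python
def residue_fraction(seq: str, residues: str = "QSTP") -> float:
    """Fraction of positions in seq occupied by any of the given residues."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(1 for aa in seq if aa in residues) / len(seq)


def is_leucine_heptad(seq: str, start: int, repeats: int) -> bool:
    """True if leucine occupies positions start, start+7, ... (0-based).

    Simplifying assumption: only leucine counts; real analyses often
    tolerate other hydrophobic residues at some positions.
    """
    seq = seq.upper()
    positions = [start + 7 * i for i in range(repeats)]
    if start < 0 or repeats < 1 or positions[-1] >= len(seq):
        return False
    return all(seq[p] == "L" for p in positions)


# Toy sequences (hypothetical, for illustration only).
toy_rich = "QQQSSSTTPPAAAA"               # 10 of 14 residues are Q/S/T/P
toy_heptad = ("L" + "A" * 6) * 3 + "L"    # leucine at 0, 7, 14, 21
```

Applied to a real protein sequence, `residue_fraction(seq[:267])` would give the kind of composition figure quoted for the amino-terminus.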

6.2.5. SCR IS A MEMBER OF A NOVEL PROTEIN FAMILY

The Arabidopsis SCR protein sequence was compared
with the sequences in the available databases. Eleven
expressed sequence tags (ESTs), nine from Arabidopsis, one
from rice and one from maize, showed significant similarity
to residues 394 to 435 of the SCR sequence, a region
immediately amino-terminal to the second leucine heptad
repeat (FIGS. 15K-L). This region is designated the VHIID
domain. Subsequent analysis of these EST sequences has
revealed that the sequence similarity extends beyond this
region; in fact, the similarity extends throughout the
entirety of the known gene products. The combination and
order of the motifs found in these sequences do not show
significant similarity to the general structures of other
established regulatory protein families (i.e., bZIP, zinc
finger, MADS-domain, and homeodomain), indicating that the
SCR proteins comprise a novel family.



6.2.6. SCR IS EXPRESSED IN THE CORTEX/ENDODERMAL
INITIALS AND IN THE ENDODERMIS

RNA blot analysis revealed expression of SCR in
Arabidopsis siliques, leaves and roots of wild-type plants
(FIG. 6A). No hybridization was detected to RNA from scr-1
plants (FIG. 6B, lane 2). This indicates that scr-1 has a
reduced level of RNA expression and may represent the null
phenotype. Hybridization to RNA species larger than the
normal size was detected in scr-2. This indicates that
abnormal SCR transcripts are made in this allele, suggesting
that functional but possibly altered proteins may be
produced.

To determine if expression was localized to any
particular cell type, RNA in situ hybridization was performed
on sections of root tissue. In mature roots, expression was
localized primarily to the endodermis (FIGS. 7A and 7B).
Expression appeared to start very close to or within the
cortex/endodermal initials and continue up the endodermal
cell file as far as the section extended. Expression was
also detected in late-torpedo stage embryos in the endodermis
throughout the embryonic axis (FIG. 7C). Sense strand
controls showed only background hybridization (FIG. 7D).

To determine whether the localization of SCR RNA
was regulated at the transcriptional or post-transcriptional
level, enhancer trap (ET) lines were prepared and examined in
which the β-glucuronidase (uid-A or GUS) coding sequence with
a minimal promoter was expressed in the root endodermis
(see Section 7, infra). Restriction fragment length
polymorphisms were observed when DNA from one of these lines,
ET199, and from wild-type was probed with SCR. PCR and
sequence analysis confirmed that the enhancer-trap construct
had inserted approximately 1 kb upstream of the SCR start
site and in the same orientation as that of SCR
transcription.

In mature roots, expression in ET199 whole mounts
showed a similar pattern to that of the in situ
hybridizations, with the strongest staining present in
endodermal cells (FIG. 7E). Transverse sections indicated
that expression was primarily in endodermal cells in the
elongation zone (FIG. 7F). Longitudinal sections through the
meristematic zone revealed that expression could be detected
in the cortex/endodermal initial (FIG. 7G). Of particular
interest was the restriction of expression to the endodermal
daughter cell after the periclinal division (FIG. 7G). This
indicated that the expression pattern observed in the in situ
analysis was not due to post-transcriptional partitioning of
SCR RNA. Rather, it suggests that after the periclinal
division of the cortex/endodermal initial only one of the two
cells is able to transcribe SCR RNA.



6.3. DISCUSSION

6.3.1. THE SCR GENE REGULATES AN ASYMMETRIC
DIVISION REQUIRED FOR ROOT RADIAL
ORGANIZATION

The formation of the cortex and endodermal layers
in the Arabidopsis root requires two asymmetric divisions.
In the first, an anticlinal division of the cortex/endodermal
initial generates two cells with different developmental
potentials. One will continue to function as an initial,
while the other undergoes a periclinal division to generate
the first cells in the endodermal and cortex cell files.
This second asymmetric division is eliminated in the
scarecrow mutant, resulting in a single cell layer instead of
two. The scr mutation appears to have little effect on any
other cell divisions in the root, indicating that it is
involved in regulating a single asymmetric division in this
organ. Several other mutations have been characterized that
appear to affect specific cell division pathways in
Arabidopsis. These include knolle (kn), in which formation of
the epidermis is impaired (Lukowitz et al., 1996, Cell 84:61-
71), wooden leg (wol), in which vascular cell division is
defective (Scheres et al., 1995, Development 121:53-62), and
fass (fs), in which there are supernumerary cortex and
vascular cells (Scheres et al., 1995, Development 121:53-62;
Torres Ruiz & Jürgens, 1994, Development 120:2967-2978).
Only in the case of scr and short-root (shr) mutants has it
been shown that the defect is in a specific asymmetric
division.

Mutational analyses in several organisms have
revealed that the genes that regulate asymmetric divisions
can be specific to a single type of division or can affect
divisions that are not clonally related (Horvitz &
Herskowitz, 1992, Cell 68:237-255). In most cases, these
mutations result in the formation of two identical daughter
cells with similar developmental potentials (Horvitz &
Herskowitz, 1992, Cell 68:237-255). Both resulting cells
have the identity of one or the other of the normal daughter
cells, an example of which is the swi- mutation in S.
cerevisiae (Nasmyth et al., 1987, Cell 48:579-587). However,
there are also examples of mutations that result in the
formation of chimeric cell types, such as the ham-1 mutation
in C. elegans (Desai et al., 1988, Nature 336:638-646).

6.3.2. SCR INVOLVEMENT IN CELL
SPECIFICATION OR CELL DIVISION

Genes that regulate asymmetric cell divisions can
be divided into those that specify the differentiated fates
of the daughter cells and those that function to effect the
division of the mother cell (Horvitz & Herskowitz, 1992,
Cell 68:237-255). The aberrant cell layer formed in the scr
mutant has differentiated features of both endodermal and
cortex cells. Thus, scr is in the rare class of asymmetric
division mutants in which a chimeric cell type is created.
The ability to express differentiated characteristics of
cortex and endodermal cells implies that the differentiation
pathways for both these cell types are intact and do not
require the functional SCR gene. This indicates that SCR is
involved primarily in regulating a specific cell division,
and that the correct occurrence of this division can be
unlinked from cell specification. This is in contrast to the
shr mutant, in which the periclinal division of the
cortex/endodermal initial also fails to occur and the
resulting cell lacks endodermal markers (Benfey et al., 1993,
Development 119:57-70) and has cortex attributes. A genetic
analysis was used to address the function of SHR and SCR in
the asymmetric division of the cortex/endodermal initial.
Placing mutants of each of these genes in a fs mutant
background asked whether the supernumerary cell divisions
characteristic of fs were sufficient to restore normal cell
identities (Scheres et al., 1995, Development 121:53-62). In
the shr,fs double mutant there were additional cell layers
but none were endodermal, indicating that the SHR gene has a
role in specifying cell identity. In the scr,fs double mutant
no alteration in cell identity was observed as compared to fs
(Scheres et al., 1995, Development 121:53-62). Taken
together with the cell marker analysis presented herein,
these results are consistent with a role for SCR in
generating the division of the mother cell, while the SHR
gene may be involved in specifying the fate of the
endodermal daughter.

6.3.3. A ROLE FOR SCR IN EMBRYONIC DEVELOPMENT

At least one additional cell division appears to be
affected in the scr mutant. During embryonic development,
the ground tissue does not divide to form the endodermal and
cortex layers of the embryonic root and hypocotyl. As shown
herein, expression of SCR was detected in the endodermal
tissue throughout the embryonic axis shortly after this
division occurs. Thus, SCR may play a direct role in
regulating both this division and the division of the
cortex/endodermal initial in the root apical meristem.
Alternatively, the radial organization established in the
embryo may somehow act as a template that directs the
division of the cortex/endodermal initial, thus perpetuating
the pattern. This is consistent with the finding in the scr
mutant that the aberrant pattern established in the embryo is
perpetuated in the primary root. It is also consistent with
a recent study in which the daughter cells of the
cortex/endodermal initial were laser ablated (van den Berg et
al., 1995, Nature 378:62-65). When a single daughter cell
was ablated, it was replaced by a cell that followed the
normal asymmetric division pattern. When three adjacent
daughter cells were ablated, the central initial divided
anticlinally but failed to perform the periclinal division
(van den Berg et al., 1995, Nature 378:62-65). This provided
evidence that information from mature cells is required for
the correct division pattern of cortex/endodermal initials,
suggesting a "top-down" transfer of information. However,
the absence of a cell layer in lateral roots and callus-
derived roots of the scr mutant suggests that embryonic
events are not unique in their ability to establish radial
organization. Rather, these observations implicate SCR in
regulating both embryonic and post-embryonic root radial
organization.

6.3.4. TISSUE-SPECIFIC EXPRESSION OF SCR IS
REGULATED AT THE TRANSCRIPTIONAL LEVEL

Although not intending to be limited to any theory
or explanation regarding the mechanism of SCR action, the
cloning of the gene and the expression pattern provide some
clues as to the role of SCR in the regulation of a specific
asymmetric division. The SCR gene is expressed in the
cortex/endodermal initial, but immediately after division is
restricted to the endodermal lineage. A similar pattern is
seen in the ET199 enhancer trap line in which SCR regulatory
elements are in proximity to a GUS gene, indicating that SCR
restriction to the endodermal cell file is due to
differential regulation of expression of the SCR gene in this
cell and the first cell in the cortex file. Another marker
line in which expression of GUS is detected only in the
cortex daughter cell provides a control for differential
degradation of GUS RNA or protein. Thus, partitioning of SCR
RNA as a means of achieving this segregation of expression
can be ruled out. What remains to be determined is whether
this difference in transcriptional activity of the two
daughter cells is due to internal polarity of the mother cell
prior to division such that cytoplasmic determinants are
unequally distributed, or to external polarity that
influences cell fate after division. Since SCR is expressed
prior to cell division, an attractive hypothesis is that it
is involved in establishing polarity in the cortex/endodermal
initial. The sequence of the SCR protein strongly suggests
that it acts as a transcription factor. Hence, it may act to
regulate the expression of other genes essential for the
establishment of unequal division. Alternatively, it is
conceivable that it could play a role in creating an external
polarity that provides a signal to divide asymmetrically.
Its expression in more mature endodermal cells is consistent
with a role in "top-down" signaling.

6.3.5. A NEW FAMILY OF TRANSCRIPTIONAL REGULATORS 
Analysis of eighteen EST clones found in the
GenBank database reveals that the proteins they encode share
a high degree of homology with Arabidopsis SCR protein. See
Table 1 and FIGS. 15A-S. Further sequence analysis of the
encoded proteins indicates that a high degree of sequence
similarity extends from at least the highly conserved VHIID
domain to the carboxy-terminus of the gene products.
Comparison of the amino termini of these proteins is
precluded by the fact that the ESTs are incomplete. The high
degree of similarity among these proteins, in combination
with the motifs observed in the SCR protein (homopolymeric
motifs, two leucine heptad repeats and a bZIP-like basic
domain that may also function as a nuclear localization
sequence), indicates that these proteins form a novel class
of regulatory proteins.

The insertion sites of the T-DNA in the two scr
mutant alleles raised the possibility that the mutant
phenotype was due to the production of truncated proteins.
Northern blot analysis indicated SCR RNA is undetectable in
scr-1. This suggests that the phenotype is either the null,
or due to highly reduced RNA expression. In scr-2, an
alteration in RNA size was detected which would be consistent
with the presence of a functional and possibly truncated
protein. This could provide an explanation for the
observation that scr-2 appears to be the weaker allele.
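
The truncation argument can be made concrete with simple arithmetic on the figures reported in Section 6.2: a 653 amino acid product, insertions at codons 470 (scr-1) and 605 (scr-2), and a second leucine heptad repeat at residues 436 to 473. The sketch below is illustrative only; the helper names are hypothetical, coordinates refer to the coding sequence with the intron ignored, and it says nothing about whether a truncated product is actually made or stable.

```python
PROTEIN_LENGTH = 653                    # residues, from Section 6.2.4
INSERTION_CODON = {"scr-1": 470, "scr-2": 605}
SECOND_HEPTAD = (436, 473)              # residue range, from Section 6.2.4


def codon_to_cds_range(codon: int) -> tuple:
    """1-based nucleotide positions of a codon within the coding sequence."""
    first = 3 * (codon - 1) + 1
    return first, first + 2


def fraction_upstream(codon: int, length: int = PROTEIN_LENGTH) -> float:
    """Fraction of the protein encoded by codons before the insertion codon."""
    return (codon - 1) / length


def domain_upstream_of_insertion(domain: tuple, codon: int) -> bool:
    """True if the whole domain lies before the interrupted codon."""
    return domain[1] < codon
```

Under these assumptions the scr-1 insertion falls inside the second leucine heptad repeat, whereas the scr-2 insertion lies downstream of it, which fits the suggestion that scr-2 could yield a partly functional product.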

7. EXAMPLE 2: ENHANCER TRAP ANALYSIS OF ROOT DEVELOPMENT 
An enhancer trap system was used in order to
provide a more detailed molecular analysis of gene expression
in lateral root patterning and development in Arabidopsis
thaliana. A new collection of marker lines that express β-
glucuronidase (GUS) activity in a cell-type specific manner
in each of the cells of the root was generated. These lines
allow differentiation of cells to be monitored based on
molecular characteristics. One of these marker lines, ET199,
resulted from the integration of the GUS cassette in
proximity to an SCR enhancer. The results described below
demonstrate that transcriptional activation of the SCR gene
plays an important role in root development in Arabidopsis,
and that SCR gene transcriptional regulatory elements can
express a transgene in a developmentally and tissue specific
manner.

7.1. MATERIALS AND METHODS

7.1.1. PLANT GROWTH CONDITIONS

Arabidopsis seeds from No-O and Columbia ecotypes
were sterilized and sown on MS plates containing 4.5%
sucrose. Plates were oriented vertically and maintained
under a cycle of 18 hours light and 6 hours dark.



7.1.2. HISTOLOGY AND GUS STAINING

For observation of lateral roots, roots were
removed from plates and infiltrated in 25% glycerol for
several hours to overnight. Roots were then mounted in 50%
glycerol. Whole seedlings were stained for GUS activity for
up to three days in the following solution: 1X GUS buffer,
20% methanol, 0.5 mg/ml X-Gluc. Addition of methanol greatly
improves the specificity and reproducibility of staining.
Staining solution was made fresh from a 10X buffer (1 M Tris
pH 7.5, 290 mg NaCl, 66 mg K3Fe(CN)6) that was stored for no
more than one week. Stained roots were cleared in glycerol
and mounted as above. All samples were observed using
Nomarski optics on a Leitz Laborlux S microscope.
Photographs were taken using a Leitz MPS52 camera, and images
were scanned into Adobe Photoshop to create figures. In some
cases the intensity of the blue color was increased.
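
For bench use, the staining solution above (1X GUS buffer from a 10X stock, 20% methanol, 0.5 mg/ml substrate) can be scaled to any final volume with simple dilution arithmetic. The following is a minimal sketch, not part of the described method: the function name is hypothetical, and the concentration of the substrate stock (50 mg/ml here) is an assumption, since the text does not state how the substrate was added.

```python
def gus_staining_recipe(final_ml: float, substrate_stock_mg_per_ml: float = 50.0) -> dict:
    """Volumes (ml) of each component for a given final volume.

    Assumptions: 10X buffer diluted to 1X, methanol at 20% (v/v),
    substrate at 0.5 mg/ml added from a liquid stock of the stated
    (hypothetical) concentration, water to volume.
    """
    if final_ml <= 0 or substrate_stock_mg_per_ml <= 0:
        raise ValueError("volumes and concentrations must be positive")
    buffer_10x = final_ml / 10.0
    methanol = final_ml * 0.20
    substrate = final_ml * 0.5 / substrate_stock_mg_per_ml
    water = final_ml - buffer_10x - methanol - substrate
    if water < 0:
        raise ValueError("substrate stock is too dilute for this recipe")
    return {
        "10X GUS buffer": buffer_10x,
        "methanol": methanol,
        "substrate stock": substrate,
        "water": water,
    }


recipe_10ml = gus_staining_recipe(10.0)
```

With the assumed stock, 10 ml of staining solution would take 1 ml of 10X buffer, 2 ml of methanol, 0.1 ml of substrate stock and water to volume.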



7.1.3. CONSTRUCTION OF ENHANCER TRAP LINES
Plant Cloning Vector (PCV) (Koncz et al., 1994,
Specialized vectors for gene tagging and expression studies,
in Plant Molecular Biology Manual, Gelvin & Schilperoort,
eds., Vol. B2, pp. 1-2, Kluwer Academic Press, Dordrecht,
The Netherlands) contains a Bam HI site immediately adjacent
to the T-DNA right border sequence. The β-glucuronidase gene
fused to the TATA region (-46 to +8) of the CaMV 35S promoter
was introduced into this site (Benfey et al., 1990, EMBO J.
9:1677-1684). 350 transgenic lines were generated by
Agrobacterium-mediated root transformation (Marton & Browse,
1991, Plant Cell Reports 10:235-239), and 4 independent lines
from each transformant were screened for GUS activity in the
root.

7.2. RESULTS 

7.2.1. DIFFERENTIATION IN THE LRP
The marker lines described above reflect patterns
of gene expression that are specific to individual root cell
types. There are no readily apparent mutant phenotypes in
any of these lines. Therefore, they can be used to analyze
the differentiation state of the cells during normal
development of the lateral root primordium (LRP). If there
are stages at which the pericycle cells proliferate in the
absence of patterning, it can be expected that all cells
would be identical with none expressing differentiated
characteristics. In contrast, organization of the LRP would
be reflected in differential patterns of GUS gene expression,
with certain cells beginning to turn on transcription from
differentiated cell-type specific promoters (i.e., those that
drive GUS expression in the enhancer trap lines).

The process of lateral root formation is divided
into the following seven stages:

Stage I: The LRP is first visible as a set of pericycle
cells that are clearly shorter in length than their
neighbors, having undergone a series of anticlinal divisions.
Laskowski et al., 1995, Dev. 121:3303-3310 predict that there
are approximately 4 founder pericycle cells involved. In the
longitudinal plane, these divisions result in the formation
of 8-10 small cells, which enlarge in a radial direction.

Stage II: A periclinal division occurs that divides the LRP
into two layers (Upper Layer (UL) and Lower Layer (LL)). Not
all the small pericycle-derived cells appear to participate
in this division; typically the most peripheral cells do
not divide. Hence, as the UL and LL cells expand radially
the domed shape of the LRP begins to appear.

Stage III: The UL divides periclinally, generating a three
layer primordium comprised of UL1, UL2 and LL. Again, some
peripheral cells do not divide, creating peripheral regions
that are one and two cell layers thick. This further
emphasizes the domed shape of the LRP.

Stage IV: The LL divides periclinally, creating a total of
four cell layers (UL1, UL2, LL1, LL2). At this stage the LRP
has penetrated the parent endodermal layer.



Stage V: The central cells in LL2 undergo a number of
divisions that push the overlying layers up and distort the
cells in LL1. These divisions are difficult to visualize at
this stage, but clearly form a knot of mitotic activity. The
LRP at this stage is midway through the parent cortex. The
outer layer contains 10-12 cells.

Stage VI: This stage is characterized by several events.
The four central cells of UL1 divide periclinally. This
division is particularly useful in identifying the median
longitudinal plane in the enlarging LRP. At this point
there are a total of twelve cells in UL1, four in the middle
that have undergone the periclinal division and four on
either side. In addition, all but the most central cells of
UL2 undergo a periclinal division. At this point the LRP has
passed through the parent cortex layer and has penetrated the
epidermis. The central cells apparently derived from LL2
have a distinct elongated shape characteristic of vascular
elements.

Stage VII: As the primordium enlarges it becomes difficult
to characterize the divisions in the internal layers.
However, the cells in the outermost layer can still be seen
very clearly. All of these cells undergo an anticlinal
division, resulting in 16 central cells (8 cells in each of
two layers) flanked by 8-10 cells on each side. We refer to
this as the 8-8-8 cell pattern. The LRP appears to be just
about to emerge from the parent root.
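
The layer bookkeeping in stages II to IV, and the cell counts behind the "8-8-8" pattern of stage VII, can be followed with a small model. This is only a sketch of the description above under simplifying assumptions: every cell of a dividing layer is treated as dividing (the text notes that peripheral cells often do not), the function names are hypothetical, and the flank count is idealized at 4 cells per side before the final anticlinal division.

```python
def periclinal_split(layers: list, name: str) -> list:
    """Replace one layer with its two daughter layers (outer first)."""
    i = layers.index(name)
    daughters = ["UL", "LL"] if name == "LRP" else [name + "1", name + "2"]
    return layers[:i] + daughters + layers[i + 1:]


def layers_by_stage() -> dict:
    """Layer names, outermost first, for stages I to IV as described."""
    stages = {"I": ["LRP"]}
    stages["II"] = periclinal_split(stages["I"], "LRP")   # LRP -> UL, LL
    stages["III"] = periclinal_split(stages["II"], "UL")  # UL -> UL1, UL2
    stages["IV"] = periclinal_split(stages["III"], "LL")  # LL -> LL1, LL2
    return stages


def stage_vii_outer_counts(central: int = 4, flank_each_side: int = 4) -> dict:
    """Cell counts in the outer region after the stage VII anticlinal division.

    Stage VI: the central cells of UL1 have divided periclinally (two layers).
    Stage VII: every cell then divides once anticlinally, doubling each count.
    """
    central_per_layer = central * 2
    return {
        "central_per_layer": central_per_layer,
        "central_total": central_per_layer * 2,
        "flank_each_side": flank_each_side * 2,
    }
```

In this idealized model the stage VII counts come out as 8 cells per central layer, 16 central cells in total and 8 cells on each flank, matching the lower end of the 8-10 range given in the text.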

7.2.2. MARKER LINES 

An enhancer trapping cassette was generated by
fusing the GUS coding sequence to the minimal promoter of the
35S promoter from CaMV. This minimal promoter does not
produce a detectable level of GUS expression. However, its
presence allows other upstream elements to direct GUS
expression in a developmental and/or cell-specific manner
(Benfey et al., 1990, EMBO J. 9:1677-1684). The use of a
minimal promoter instead of a promoterless construct allows
GUS expression to occur even if the enhancer trap cassette
inserts at a distance from the coding region. Since the
insert does not have to be within the structural gene, there
are often no mutations generated in the enhancer trap lines.
The minimal promoter::GUS construct was cloned immediately
adjacent to the T-DNA right border sequence of PCV (Koncz et
al., supra) and introduced into Arabidopsis. 350 independent
lines were generated and analyzed for GUS activity in the
root. The following lines most clearly define each cell
type. All of the lines were generated through enhancer
trapping, as described herein, except for CorAX92
(Dietrich et al., 1992, Plant Cell 4:1371-1382) and
EpiGL2:GUS (Masucci et al., Dev. 122:1253-1260), which are
transgenic plants that contain cell-type specific promoters
fused to the GUS gene.

Ste05 - expresses GUS in the stele including the pericycle
layer throughout primary and lateral roots. At the root tip,
staining becomes weaker in the elongation zone; therefore, it
is likely that only differentiated stele cells express GUS
activity. Stelar GUS expression is also seen in aerial parts
of the plant.

End195 - expresses GUS in the endodermis of primary and
lateral roots. Staining can be seen most clearly in the
cells in the meristematic region of the root, although
overstaining shows that more mature cells also express some
GUS activity. It appears that there is no staining in the
cortex/endodermal initial, but staining is evident in the
first daughter cell of this initial. GUS expression is also
seen at the base of young leaves and in the stipules.

ET199 - expresses GUS in the endodermis of primary and
lateral roots, again most clearly in cells in the
meristematic region. Unlike End195, staining in ET199
appears to continue down to the cortex/endodermal initial
and, in younger roots, even into the cells of the quiescent
center. Expression in the aerial parts of the plant is
detectable in the young leaf primordia.

CorAX92 - This line was generated by fusing the 5' and 3'
sequences from a cortex specific gene isolated from oilseed
rape to the GUS reporter gene (Dietrich et al., 1992, Plant
Cell 4:1371-1382). Expression is limited to the cortex layer,
extending to but not including the cortex/endodermal initial.
Staining is also apparent in the petioles and leaf blades of
expanded leaves.

EpiGL2:GUS - This line was generated by fusing the GL2
promoter to the GUS gene (Masucci et al., Dev. 122:1253-
1260). Expression is seen in the non-hair forming epidermal
cells (atrichoblasts). Staining is seen near the root tip,
but it is difficult to determine if it includes the epidermal
initial. Staining is also seen in the trichomes, leaf
primordia, and the epidermis of the hypocotyl and leaf
petioles.

CRC219 - This line shows staining in the columella root cap
only.

LRC244 - This line shows staining in the lateral root cap
only.

RC162 - This line shows staining in both the lateral and
columella root caps.

Two marker lines show differential staining at
very early stages of LRP development. One of these, ET199,
presents a complex and dynamic pattern of expression.
Staining is first apparent at stage II in only the four
central cells of the UL. At stage III staining is strongest
in the central cells of UL2. As the LRP reaches stage V the
staining remains strongest in the central 2-4 cells of UL2.
By stage VI staining also begins to extend into the newly
formed endodermal layer, and staining in both the central
cells and endodermis persists beyond emergence of the lateral
root.

Another line, LRB10 (lateral root base), does not
express GUS in the primary root tip. Staining in the LRP is
seen at stage I, and at stage II all the cells of the UL and
LL are stained. However, by stage IV and V only the cells at
the periphery of the LRP are still expressing GUS. As the
LRP develops, these cells continue to stain, although less
intensely, resulting in a ring of GUS expressing cells at the
base of the LR.

LRB10 and ET199 clearly demonstrate non-identity
between the cells at very early stages, stage IV in the case
of LRB10 and within the UL at stage II in ET199. In
addition, although it is difficult to identify the nature of
the cells that correspond to the observed staining pattern in
LRB10 and the early staining cells of ET199, post-emergent
lateral roots show analogous staining in these lines,
suggesting that the stained cells are already expressing
markers that reflect their differentiated cell fates. Hence,
these observations suggest a very early onset of
differentiation in the cells of the LRP.

7.2.3. ET199 PROVIDES EVIDENCE FOR THE ROLE OF
SCR IN PLANT DEVELOPMENT

Fortuitously, it was discovered that the GUS
cassette in ET199, described in Section 7.2.2 above, is
situated approximately 1 kb upstream from the SCR gene. The
SCR cDNA was labelled and used to probe genomic DNA from WT
and ET199 plants. The band pattern seen in the Southern was
completely consistent with a T-DNA inserted 1 kb upstream of
the putative SCARECROW start site. Subsequently, a DNA
fragment was PCR amplified using a primer within the T-DNA
and a primer within SCARECROW. The size of this fragment was
also consistent with the predicted insertion site. Partial
sequencing of the PCR fragment confirmed the presence of
SCARECROW sequence. Mutants in the SCR gene are completely
lacking one of the radial layers between the epidermis and
pericycle in both primary and lateral roots, due to the
absence of specific cell division during embryogenesis and of
the cortex/endodermal initial during post-embryonic growth.
The expression pattern (described in Section 7.2.2, above)
that was observed in the central cells of the developing LRP
of ET199 provides strong evidence that the cells in this
region are involved in the establishment of the meristematic
initials. More importantly, these results demonstrate that
transcriptional activation of the SCR gene plays a major role
in the development of the Arabidopsis LRP. Furthermore,
these results demonstrate that a transgene can be expressed
under the control of SCR gene transcriptional regulatory
elements in a developmental and tissue-specific manner.

8. EXAMPLE 3: ACTIVITY OF ARABIDOPSIS SCR
PROMOTER IN TRANSGENIC ROOTS

The expression pattern of Arabidopsis SCR has
been determined by analysis of an enhancer trap line, ET199,
in which a GUS coding region with a minimal promoter was
fortuitously inserted 1 kb upstream of the SCR coding region
(see supra). In ET199 plants, GUS expression is detected in
the endodermis, endodermal initials and sometimes in the
quiescent center (QC) of the root. See supra and Malamy and
Benfey, 1997, Dev. 124:33-44. This expression pattern of SCR
in the primary root has been confirmed by in situ analysis
(see supra and Di Laurenzio et al., 1996, Cell 86:423-433).

The following experiments demonstrate that 2.5 kb
of 5' sequence upstream of the Arabidopsis SCR coding region
is sufficient to confer the SCR expression pattern on a
heterologous gene. The 5' sequence used in these studies
starts from the Hind III site approximately 2.5 kb upstream
of the ATG initiation site and extends 3' downstream to the
base pair immediately upstream of the ATG initiation site
(see FIG. 14). This 5' sequence was fused to a GUS coding
sequence. The resulting SCR promoter::GUS construct was
incorporated into an Agrobacterium vector, which was used to
transform and generate transgenic roots using standard
procedures.
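
The boundaries of the 5' fragment described above (from a Hind III site roughly 2.5 kb upstream of the ATG to the base immediately before the ATG) can be expressed as a simple sequence slice. The sketch below is illustrative only: the function name and the toy sequence are hypothetical, the real SCR genomic sequence is not used, and the fragment is taken to begin at the first base of the Hind III recognition sequence (AAGCTT) rather than at the cut position, which is an assumption.

```python
HINDIII_SITE = "AAGCTT"  # Hind III recognition sequence


def upstream_fragment(genomic: str, atg_index: int, target_len: int = 2500) -> str:
    """Return the 5' fragment ending at the base just before the ATG.

    The fragment starts at the upstream Hind III site whose distance
    from the ATG is closest to target_len (0-based indices).
    """
    genomic = genomic.upper()
    if genomic[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    sites = []
    pos = genomic.find(HINDIII_SITE)
    while pos != -1 and pos < atg_index:
        sites.append(pos)
        pos = genomic.find(HINDIII_SITE, pos + 1)
    if not sites:
        raise ValueError("no Hind III site upstream of the ATG")
    start = min(sites, key=lambda p: abs((atg_index - p) - target_len))
    return genomic[start:atg_index]  # excludes the ATG itself


# Toy sequence (hypothetical): one Hind III site 2500 bp upstream of the ATG.
toy_genomic = "CC" + HINDIII_SITE + "G" * 2494 + "ATG" + "AAA"
toy_fragment = upstream_fragment(toy_genomic, atg_index=2502)
```

Because the slice stops at `atg_index`, the returned fragment carries no coding sequence, so a reporter fused to it would rely on its own initiation codon.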

A large number of roots were regenerated. They
showed a GUS staining pattern similar to the SCR
expression pattern in ET199 plants (Figure 19, Panel f).
Since organs regenerated from callus often have an abnormal
morphology, transgenic roots were transferred to liquid
culture. Roots grown in liquid culture appeared
morphologically normal and showed GUS expression in the
endodermis, endodermal initial and QC (Figure 19, Panel g),
similar to the expression pattern of SCR seen in the
enhancer trap line ET199. These results indicate that the
2.5 kb region upstream of the SCR start site is sufficient to
confer the SCR expression pattern in the root.

The expression of the SCR promoter::GUS construct 
was also examined in the scr mutant background. The scr mutant 
has an altered root organization (see supra). Whereas the 
wild-type root of Arabidopsis has four distinct cell layers 
surrounding the vascular tissue, the roots of the scr mutant have 
only three. 

Transgenic roots of the scr mutant were generated 
that contained a SCR promoter::GUS construct. As in the 
wild type, a large number of transgenic roots were formed 
that had detectable GUS expression (Figure 20, Panel a). 
These roots were shorter than wild-type regenerated roots, 
consistent with the shorter root phenotype of the scr mutant. 

Additional transgenic root experiments 
demonstrated that the SCR gene under control of its own 
promoter can rescue the scr mutant phenotype. Transgenic scr 
roots were generated that contained the full-length SCR gene 
under the control of its own promoter. Transgenic roots 
containing the construct were longer than 
those of the scr mutant, indicating that the introduced SCR 
gene partially rescued the mutant. Whereas scr regenerated 
roots that carried the SCR promoter::GUS construct were very 
short (Figure 21, Panel a; and Figure 20, Panel a), roots 
transformed with the SCR promoter and coding region were 
noticeably longer (Figure 21, Panel b). The difference was 
even more obvious in liquid culture, in which scr mutant 
roots remained short (Figure 21, Panel c), while SCR 
gene-complemented scr mutant roots were long and resembled 
wild-type roots (Figure 21, Panel d). 

Anatomical studies of the regenerated roots 
confirmed the ability of the SCR promoter::SCR gene construct 
to rescue the scr mutant phenotype. Whereas regenerated 
roots of the scr mutant were missing an internal layer (Figure 
21, Panel e), the scr mutant roots that were transformed with 
the SCR promoter::SCR gene construct had a radial 
organization that resembled that of the wild-type root (Figure 21, 
Panel f). 

9. EXAMPLE 4: ISOLATION OF SCR SEQUENCES USING 
PCR-CLONING STRATEGY 

Based on the comparison of the sequences of SCR 
paralogs in Arabidopsis, degenerate primers SCR3AII, SCR5AII 
and SCR5B were designed and used in PCR amplification of SCR 
sequences from genomic DNA of various plant species. The 
amplification was performed according to the conditions described 
in Section 5.1.1., supra, using DNA isolated from maize 
plants grown from a commercial seed mixture. Amplification 
products (a 104 bp fragment for the SCR5B+SCR3AII primer 
combination; a 146 bp fragment for the SCR5AII+SCR3AII primer 
combination) were obtained, and each was cloned into a T/A vector 
(Invitrogen, San Diego, CA) and sequenced. Two of the three 
different types of clones obtained had deduced amino acid 
sequences that were very similar to a part of the Arabidopsis 
SCR protein (i.e., approximately 90% identity), suggesting 
that they represent parts from two different alleles of the 
maize SCR gene (i.e., ZCR gene). The two clones each had 
only two conservative changes in their nucleotide sequence. 
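
The identity figure quoted above is a pairwise comparison of aligned residues. As a minimal sketch of how such a figure can be computed, assuming the two segments have already been aligned to equal length (the alignment step itself is not shown), the Python function below counts matching positions. The function name and both 20-residue strings are hypothetical placeholders chosen for illustration; they are not the maize or Arabidopsis sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap character ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped column, not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical one-letter peptide segments (illustration only).
reference = "AFEKEDSVHIIDLDIMQGLQ"
candidate = "AFDKEDSVHIIDLDIMQGLK"  # differs at positions 3 and 20

identity = percent_identity(reference, candidate)
print(f"{identity:.1f}% identity over {len(reference)} aligned residues")
```

With two mismatches in twenty aligned positions, the toy pair gives the same order of similarity as the value reported for the maize clones.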

The 146 bp amplification product, ZmScl1, was 
subsequently used as a probe for screening of a genomic 
library generated in lambda BlueSTAR vector (NOVAGEN) from 
maize (Hill line) genomic DNA. The screening was performed 
according to the standard procedures described in the Genius 
System User's Guide for Membrane Hybridization 
(Boehringer-Mannheim). The probe was a single-strand DNA molecule 
corresponding to the ZmScl1 fragment produced by PCR (Genius, 
Boehringer-Mannheim). Hybridization was performed according 
to the recommendations of the manufacturer's manual 
(Boehringer-Mannheim). Prehybridization was for 2 hr in 50% 
formamide hybridization solution at 42°C. Hybridization was 
overnight at 42°C with 200 ng/ml probe concentration. 
Filters were washed twice at room temperature in 2xSSC, 0.1% 
SDS for 5 min, and for stringent washing at 65°C in 
0.5xSSC, 0.1% SDS twice for 15 min. 
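
The wash conditions above give final concentrations only. A minimal sketch of the dilution arithmetic (C1 x V1 = C2 x V2) follows, assuming 20xSSC and 10% SDS stock solutions and a 500 ml batch. These stock strengths, the batch volume and the function names are assumptions made for illustration; none of them is stated in this example.

```python
def stock_volume_ml(stock_conc: float, final_conc: float, final_volume_ml: float) -> float:
    """Solve C1 * V1 = C2 * V2 for V1 (volume of stock to add)."""
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_conc * final_volume_ml / stock_conc


def wash_recipe(ssc_x: float, sds_percent: float, total_ml: float,
                ssc_stock_x: float = 20.0, sds_stock_percent: float = 10.0) -> dict:
    """Volumes of SSC stock, SDS stock and water for one wash buffer.

    The default stock strengths (20xSSC, 10% SDS) are assumed values.
    """
    ssc_ml = stock_volume_ml(ssc_stock_x, ssc_x, total_ml)
    sds_ml = stock_volume_ml(sds_stock_percent, sds_percent, total_ml)
    return {
        "ssc_stock_ml": ssc_ml,
        "sds_stock_ml": sds_ml,
        "water_ml": total_ml - ssc_ml - sds_ml,
    }


# Room-temperature wash (2xSSC, 0.1% SDS) and stringent wash (0.5xSSC, 0.1% SDS).
low_stringency = wash_recipe(ssc_x=2.0, sds_percent=0.1, total_ml=500.0)
high_stringency = wash_recipe(ssc_x=0.5, sds_percent=0.1, total_ml=500.0)
print(low_stringency)
print(high_stringency)
```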

A positive clone was identified. The clone 
contained a 13 kb insert, which was subcloned into a plasmid 
vector. The resulting plasmid was designated pZCR. A 5 kb 
Eco RI fragment containing the maize SCR (ZCR) sequence was 
subcloned and sequenced. The nucleotide sequence of the 
region containing a partial ZCR coding sequence is shown in 
FIG. 17A and the corresponding deduced amino acid sequence is 
shown in FIG. 17B. The ZCR protein contains a segment that is 
highly homologous to a corresponding segment in the 
Arabidopsis SCR protein (FIG. 17B). This segment is flanked 
by segments of low homology. Thus, it is possible that the 
genomic clone of ZCR is a composite clone, containing 
sequences that are not ZCR sequences. 

The deduced ZCR protein sequence was aligned with 
that of Arabidopsis SCR protein. The comparison revealed new 
conserved sites in the SCR coding sequence which were used to 
design new, more specific PCR primers (i.e., 1F, 1R, and 4R) 
for use in amplification of SCR sequences from yet other 
plant species. 

Using combinations of primers 1F+1R and 1F+4R, 
PCR amplifications were performed as described in Section 
5.1.1. Two DNAs of expected size were obtained from soybean: 
a 247 bp DNA from the 1F+1R primer combination and a 379 bp 
DNA from the 1F+4R primer combination. A DNA of expected 
size (247 bp) was obtained from carrot and spruce when their 
genomic DNA was amplified using the 1F+4R primer combination. 
The nucleotide sequences of the 379 bp soybean DNA (SRPgl), 
the 247 bp DNA from carrot (SRPdl) and spruce (SRPpl) are 
shown in FIGS. 16K-M. The corresponding deduced amino acid 
sequences of these amplified sequences are shown in FIG. 18. 
Comparison of these partial SCR coding sequences indicates that 
this approach isolated DNA sequences that encode SCR proteins 
with amino acid sequences that are very similar but not 
identical to a segment of Arabidopsis SCR protein (see FIG. 
18). 

10. EXAMPLE 5. EXPRESSION PATTERN OF MAIZE ZCR GENE 
IN ROOT TISSUE 

These experiments examined the expression pattern 
of ZCR in the primary root and quiescent centers of maize 
root. The expression pattern was determined by in situ 
hybridization using a ZCR RNA probe corresponding to the 
region encoding an amino acid segment that is highly homologous to a 
corresponding segment of the Arabidopsis SCR protein. The 
experiment was carried out as follows. Restriction fragments 
containing the maize ZCR sequence were isolated from pZCR and 
subcloned into a pBluescript vector for in vitro 
transcription. The probe was synthesized using conditions 
described in the Genius Dig RNA labeling kit. The 
pBluescript plasmid was linearized, and 1 µg was used as a 
template to synthesize digoxigenin-labeled RNA using the T7 
polymerase. The RNA probe was subjected to mild alkali 
hydrolysis by heating at 60°C for 1 hr in 100 mM carbonate 
buffer (pH 10.2) to yield a probe size of approximately 0.15 
kb. Probe concentration for hybridization was optimized at 1 
µg/ml/kb. In situ hybridization of root tips from 48 to 72 
hr-old maize seedlings or excised quiescent centers (QCs) of 
roots was carried out following procedures described in 
Section 6.1.6., supra. 
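
The hydrolysis step above shortens the labeled transcript to a target average length. As a hedged illustration only, the sketch below uses a relation commonly quoted in in situ hybridization protocols for carbonate-buffer hydrolysis at 60°C, t = (L0 - Lf) / (k x L0 x Lf), with lengths in kb, time in minutes and k taken as 0.11 per kb per minute. The relation, the rate constant and the 1.0 kb starting length are assumptions brought in for illustration; this example states only the 1 hr incubation and the approximate 0.15 kb product.

```python
def hydrolysis_minutes(initial_kb: float, final_kb: float,
                       rate_per_kb_min: float = 0.11) -> float:
    """Estimated alkaline hydrolysis time for an RNA probe.

    Uses t = (L0 - Lf) / (k * L0 * Lf); the default k = 0.11 is an
    assumed, commonly quoted rate constant, not a value from this document.
    """
    if not 0 < final_kb < initial_kb:
        raise ValueError("require 0 < final_kb < initial_kb")
    return (initial_kb - final_kb) / (rate_per_kb_min * initial_kb * final_kb)


# Hypothetical 1.0 kb transcript reduced to fragments of about 0.15 kb.
minutes = hydrolysis_minutes(initial_kb=1.0, final_kb=0.15)
print(f"estimated hydrolysis time: {minutes:.1f} min")
```

Under this relation the required time approaches 1 / (k x Lf) as the starting transcript gets longer, so estimates for a 0.15 kb target stay at or below roughly one hour regardless of the starting length.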

The results show that ZCR expression in maize 
primary roots is localized to a file of cells that is 
identified as the endodermal layer. The expression pattern 
continues in a single uninterrupted file through the QC which 
consists of approximately 1000-1500 cells (FIG. 22). 

In two-week old regenerating QCs, ZCR expression 
is found in a file of cells extending through the newly 
formed apex. Thus, the regenerated roots exhibit a ZCR 
expression pattern that is similar to that seen in the 
primary root, even though the root apex does not contain the 
normal arrangement of cell files at this stage. 

ZCR expression during regeneration of the root 
apex was also examined. In the initial stages of 
regeneration, cell proliferation occurs to fill in the 
removed tissue and begins to regenerate the basic shape of 
the root tip. All cells on the blunt edge of the root 
appear to contribute to the new population of cells. The 
ZCR expression pattern indicates that molecular signals are 
differentially present in these cells at an early stage in 
regeneration. The gene appears to be diagnostic of cells 
that are preparing to undergo asymmetrical division in order 
to re-establish the normal organization of the root apex from 
the large undifferentiated cells. The results indicate that 
ZCR expression is required for pattern formation since it is 
expressed prior to the generation of any specific anatomical 
pattern in the newly formed root tissue. 

11. EXAMPLE 6. EXPRESSION PATTERN OF SCR 
GENE IN SOYBEAN ROOTS AND ROOT NODULES 

SCR expression in soybean roots and nodules was 
examined using in situ hybridization with a SCR probe. The 
procedures used were as described in Sections 6.1.6. and 11. 

In primary roots, SCR is expressed in the 
endodermis. Expression was also found in cells at the root 
tip that are located at the distal end of the endodermal cell 
files. In soybean nodules, expression of SCR was detected in 
the peripheral tissue at the site of developing vascular 
strands. At later stages of vascular development within the 
nodule, SCR expression was found flanking the vascular 
tissue. These results indicate that SCR is involved in 
regulating vascularization in the nodule by contributing to 
the radial organization that is required to generate 
endodermis. These findings indicate that the SCR promoter may be 
used to express proteins in a highly tissue-specific manner 
in soybean nodules. One application is to use the SCR promoter 
to engineer nodules through production of components in a 
tissue-specific manner. Another application is that 
modification of the expression of SCR could enhance nodule 
activity by improving vascularization and/or the number of 
endodermal layers. 

12. EXAMPLE 7. SCR EXPRESSION AFFECTS 
GRAVITROPISM OF AERIAL STRUCTURES 

In addition to being defective in specific 
embryonic and postembryonic meristematic divisions, both the 
scr and the shr mutants have shoots that exhibit severely 
defective gravitropism. Complementation analysis showed that 
scr is allelic to a sgr (shoot gravitropism) mutant, sgr1. 
Four mutant alleles of SCR (i.e., scr1, scr2, sgr1-1 and 
sgr1-2) have been identified. All four of these mutants have 
normal root gravitropism and defective shoot gravitropism. 

Etiolated hypocotyls of scr mutants placed on 
their sides do not respond to gravity even after 3 hr. 
Similar behaviors were observed with the inflorescence stems 
of the sgr1-1 mutant, which do not curve upwards even after two 
days on their sides. In contrast, the roots of these plants 
respond rapidly to the change in orientation with the same 
kinetics as the wild type. Thus, mutations in the SCR gene 
lead to a radial pattern deficiency in the root but have no 
effect on root gravitropism. 

Comparable results were also obtained for shr 
roots and for hypocotyls and inflorescence stems, i.e., data 
indicate that shr shows normal root gravitropism but almost 
no stem gravitropism. 

13. DEPOSIT OF MICROORGANISMS 

The following microorganisms have been deposited 
in accordance with the terms of the Budapest Treaty with the 
American Type Culture Collection, 12301 Parklawn Drive, 
Rockville, MD 20852, U.S.A., on the dates indicated: 

                                           Accession 
Microorganism   Clone                      No.         Date 

DH5α            pGEX-2TK*                  98031       April 26, 1996 
                (pLZG l-3/Sac4-MOBlSac) 
DH5α            pNYH1 (Zm-scl1b)           98032       April 26, 1996 
DH5α            pNYH2 (Zm-scl1)            98033       April 26, 1996 
DH5α            pNYH3 (Zm-scl2)            98034       April 26, 1996 
DH5α            pSCR                                   April 18, 1997 

Although the invention is described in detail 
with reference to specific embodiments thereof, it will be 
understood that variations which are functionally equivalent 
are within the scope of this invention. Indeed, various 
modifications of the invention in addition to those shown and 
described herein will become apparent to those skilled in the 
art from the foregoing description and accompanying drawings. 
Such modifications are intended to fall within the scope of 
the appended claims. 

Various publications are cited herein, each of 
the disclosures of which is incorporated by reference in its 
entirety. 

(1) GENERAL INFORMATION: 

     (i) APPLICANT: Benfey, Phillip N. 
                    Di Laurenzio, Laura 
                    Wysocka-Diller, Joanna 
                    Malamy, Jocelyn E. 
                    Pysh, Leonard 
                    Helariutta, Yrjo 

    (ii) TITLE OF INVENTION: SCARECROW GENE, PROMOTER AND USES 
                             THEREOF 

   (iii) NUMBER OF SEQUENCES: 67 

    (iv) CORRESPONDENCE ADDRESS: 
          (A) ADDRESSEE: Pennie & Edmonds LLP 
          (B) STREET: 1155 Avenue of the Americas 
          (C) CITY: New York 
          (D) STATE: New York 
          (E) COUNTRY: USA 
          (F) ZIP: 10036-2711 

     (v) COMPUTER READABLE FORM: 
          (A) MEDIUM TYPE: Floppy disk 
          (B) COMPUTER: IBM PC compatible 
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS 
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

    (vi) CURRENT APPLICATION DATA: 
          (A) APPLICATION NUMBER: US 
          (B) FILING DATE: 
          (C) CLASSIFICATION: 

   (vii) PRIOR APPLICATION DATA: 
          (A) APPLICATION NUMBER: US 08/638,617 
          (B) FILING DATE: 26-APR-1996 

  (viii) ATTORNEY/AGENT INFORMATION: 
          (A) NAME: Coruzzi, Laura A. 
          (B) REGISTRATION NUMBER: 30,742 
          (C) REFERENCE/DOCKET NUMBER: 005914-0056-999 

    (ix) TELECOMMUNICATION INFORMATION: 
          (A) TELEPHONE: (212) 790-9090 
          (B) TELEFAX: (212) 869-9741 
          (C) TELEX: 66141 PENNIE 



(2) INFORMATION FOR SEQ ID NO:1: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 2163 base pairs 
          (B) TYPE: nucleic acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: DNA (genomic) 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 
CCTTATTTAT AACCATGCAA TCTCACGACC AACAACCCTT CAATCTCCAT GGCGGAATCC    60
OOCCATTTCA AOGGTOGTCA ACCTOCTCCT CATAGTCX!TC TGAGAACAAC TTCTTCCGGT    120
XGTAGCAGCA GCAACAACCG TGGTCCTCCT CCTCCTCCTC CTCCTCCTTT AGTGATGGTG    180
AOAAAAAGAT TAGCTTOCGA GATGTCTTCT AACCCTGACT ACAACAACTC CTCTOGTCCT    240
CCTOGCCGTG TCTCTCACCT TCTTGACTCC AACTACAATA CTGTGAGACC AGAACAAOCA    300
CCCTCTCTTA OGGOGGOGGC TACTGTATCT TCTCAACCAA ACCCACCACT CTCTCTTTGT    360
GGCTTCTCTG GTCTTCCOGT TTTTCCTTCA GACCGTGGTG OTC6GAATGT TATGATGTCC    420
GTACAACCAA TGGATCAAGA CTCTTCATCT TCTTCTCCTT CACCTACTGT ATOGGTTGAC    480
GCCATTATCA GAGACCTTAT CCATTCCTCA ACTTGAGTCT CTATTCCTCA ACTTATCCAA    540
AAOGTTAGAG ACATTATCTT CCCTTGTAAC CCAAATCTCG CTGCTCTTCT TGAATACAGG    600
CTCCGATCTC TCATGCTCCT TGATCCTTCC TCTTCCTCTG ACCCTTTCTCC TCAAACTTTC    660
GAACCTCTCT ATCAGATCTC CAACAATCCT TCTCCTCCAC AACAGCAACA GCAGCACCAA    720
CAACAACAAC AAGAGCATAA GCCTCCTCCT CCTCCGATTC AGCAGCAAGA AAGAGAAAAT    780
TCTTCTACCG ATGCACCACC GCAACCAGAG ACAGTGAOGG CCACTGTTCC CGCCGTCCAA    840
ACAAATACGG CGGAGCCTTT AAGAGAGAGG AAGGAAGAGA TTAAGAGGCA GAAGCAAGAC    900
GAAGAAGGAT TACACCTTCT CACATTOCTG CTACAGTOTO CTGAAGCTGT CTCTGCTGAT    960
AATCTCGAAG AAGCAAACAA CCTTCTTCTT GAGATCTCTC AGTTATCAAC TCCTTACGGG    1020
ACCTCAGCGC AGAGAGTAGC TGCTTACTTC TOGGAAGCTA TGTCAGC6AG ATTACTCAAC    1080
TCCTGTCTCG GAATTTACGC GGCTTTGCCT TCAOGGTGGA TGCCTCAAAC GCATAGCTTG    1140
AAAATGGTCT CTGOGTTTCA GGTCTTTAAT GGGATAAGCG CTTTAGTGAA ATTCTCACAC    1200
TTTACAGCGA ATCAGGOGAT TGAAGAAOCA TTTGAGAAAG AAGACAGTGT ACACATCATT    1260
GACTT6GACA TCATGCAGGG ACTTCAATGO CCTCGTTTAT TCCACATTCT TGCTTCTAGA    1320
CCTGGAGGAC CTCCACAOGT GCGACTGACG G6ACTTGGTA CTTCCATGGA AGCTCTTCAG    1380
GCTACAGGGA AACGTCTTTC GGATTTCACA GATAAGCTTG GCCTGCCTTT TGAGTTCTGC    1440
CCTTTACCTG AGAAAGTTGG AAACTTOGAC ACTGAGAGAC TCAATGTGAG GAAAAGGGAA    1500
OCTGTGGCTG TTCACTCCCT TCAACATTCT CTTTATGATG TCACTCGCTC TGATGCACAC    1560
ACTCTCTCGT TACTCCAAAG GTAAAATAAA CATTACCTTT TAATCACTCT TTATCTATAA    1620
ATTATTTTAA GATTATATAG GAAAGATATG TTCTAAAAAG CTGGCTTTTT TGGTTAATGA    1680
TTGGGGAATG AACAGATTAG CTCCTAAAGT TGTGACAGTA GTGGAGCAAG ATTTGAGCCA    1740
OGCTGGTTCT TTCTTAGGAA GATTTGTAGA GOCAATACAT TACTACTCTG CACTCTTTGA    1800
CTCACrrOGGA OGAAGCTAOG OCOAAGAGAG TCAAGAGAGA CATGTOGTGG AACAGCAGCT    1860
ATTATCGAAA GAGATACGGA ATGTATTAGC GGTTGGAGGA CCATCGAGAA GOGGTGAAGT    1920
GAAGTTTGAG AGCTGGAOOG AGAAAATGCA AGAATGTGGG TTTAAAGGTA TATCTTTAGC    1980
TGGAAATGGA OCTACACAAG OOACTCTACT OTTGGGAATG TTTCCTTOCC ATCGTTACAC    2040
TTTCGTTOAT QATAATGGTA CACTTAAGCT TGGATOGAAA GATCTTTCGT TACTCACTGC    2100
TTCAGCTTGO ACOCCTCOTT CTTAOTTTTC TTCTCCTTTT TCACAAACAA TGTOCCCATA    2160
AAT                                                                  2163



(2) INFORMATION FOR SEQ ID NO:2: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 653 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: protein 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Ala Glu Ser Gly Asp Phe Asn Gly Gly Gln Pro Pro Pro His Ser
1               5                   10                  15

Pro Leu Arg Thr Thr Ser Ser Gly Ser Ser Ser Ser Asn Asn Arg Gly
            20                  25                  30

Pro Pro Pro Pro Pro Pro Pro Pro Leu Val Met Val Arg Lys Arg Leu
        35                  40                  45

Ala Ser Glu Met Ser Ser Asn Pro Asp Tyr Asn Asn Ser Ser Arg Pro
    50                  55                  60

Pro Arg Arg Val Ser His Leu Leu Asp Ser Asn Tyr Asn Thr Val Thr
65                  70                  75                  80

Pro Gln Gln Pro Pro Ser Leu Thr Ala Ala Ala Thr Val Ser Ser Gln
                85                  90                  95

Pro Asn Pro Pro Leu Ser Val Cys Gly Phe Ser Gly Leu Pro Val Phe
            100                 105                 110

Pro Ser Asp Arg Gly Gly Arg Asn Val Met Met Ser Val Gln Pro Met
        115                 120                 125

Asp Gln Asp Ser Ser Ser Ser Ser Ala Ser Pro Thr Val Trp Val Asp
    130                 135                 140

Ala Ile Ile Arg Asp Leu Ile His Ser Ser Thr Ser Val Ser Ile Pro
145                 150                 155                 160

Gln Leu Ile Gln Asn Val Arg Asp Ile Ile Phe Pro Cys Asn Pro Asn
                165                 170                 175

Leu Gly Ala Leu Leu Glu Tyr Arg Leu Arg Ser Leu Met Leu Leu Asp
            180                 185                 190

Pro Ser Ser Ser Ser Asp Pro Ser Pro Gln Thr Phe Glu Pro Leu Tyr
        195                 200                 205

Gln Ile Ser Asn Asn Pro Ser Pro Pro Gln Gln Gln Gln Gln His Gln
    210                 215                 220

Gln Gln Gln Gln Gln His Lys Pro Pro Pro Pro Pro Ile Gln Gln Gln
225                 230                 235                 240

Glu Arg Glu Asn Ser Ser Thr Asp Ala Pro Pro Gln Pro Glu Thr Val
                245                 250                 255

Thr Ala Thr Val Pro Ala Val Gln Thr Asn Thr Ala Glu Ala Leu Arg
            260                 265                 270

Glu Arg Lys Glu Glu Ile Lys Arg Gln Lys Gln Asp Glu Glu Gly Leu
        275                 280                 285

His Leu Leu Thr Leu Leu Leu Gln Cys Ala Glu Ala Val Ser Ala Asp
    290                 295                 300

Asn Leu Glu Glu Ala Asn Lys Leu Leu Leu Glu Ile Ser Gln Leu Ser
305                 310                 315                 320

Thr Pro Tyr Gly Thr Ser Ala Gln Arg Val Ala Ala Tyr Phe Ser Glu
                325                 330                 335

Ala Met Ser Ala Arg Leu Leu Asn Ser Cys Leu Gly Ile Tyr Ala Ala
            340                 345                 350

Leu Pro Ser Arg Trp Met Pro Gln Thr His Ser Leu Lys Met Val Ser
        355                 360                 365

Ala Phe Gln Val Phe Asn Gly Ile Ser Pro Leu Val Lys Phe Ser His
    370                 375                 380

Phe Thr Ala Asn Gln Ala Ile Gln Glu Ala Phe Glu Lys Glu Asp Ser
385                 390                 395                 400

Val His Ile Ile Asp Leu Asp Ile Met Gln Gly Leu Gln Trp Pro Gly
                405                 410                 415

Leu Phe His Ile Leu Ala Ser Arg Pro Gly Gly Pro Pro His Val Arg
            420                 425                 430

Leu Thr Gly Leu Gly Thr Ser Met Glu Ala Leu Gln Ala Thr Gly Lys
        435                 440                 445

Arg Leu Ser Asp Phe Thr Asp Lys Leu Gly Leu Pro Phe Glu Phe Cys
    450                 455                 460

Pro Leu Ala Glu Lys Val Gly Asn Leu Asp Thr Glu Arg Leu Asn Val
465                 470                 475                 480

Arg Lys Arg Glu Ala Val Ala Val His Trp Leu Gln His Ser Leu Tyr
                485                 490                 495

Asp Val Thr Gly Ser Asp Ala His Thr Leu Trp Leu Leu Gln Arg Leu
            500                 505                 510

Ala Pro Lys Val Val Thr Val Val Glu Gln Asp Leu Ser His Ala Gly
        515                 520                 525

Ser Phe Leu Gly Arg Phe Val Glu Ala Ile His Tyr Tyr Ser Ala Leu
    530                 535                 540

Phe Asp Ser Leu Gly Ala Ser Tyr Gly Glu Glu Ser Glu Glu Arg His
545                 550                 555                 560

Val Val Glu Gln Gln Leu Leu Ser Lys Glu Ile Arg Asn Val Leu Ala
                565                 570                 575

Val Gly Gly Pro Ser Arg Ser Gly Glu Val Lys Phe Glu Ser Trp Arg
            580                 585                 590

Glu Lys Met Gln Gln Cys Gly Phe Lys Gly Ile Ser Leu Ala Gly Asn
        595                 600                 605

Ala Ala Thr Gln Ala Thr Leu Leu Leu Gly Met Phe Pro Ser Asp Gly
    610                 615                 620

Tyr Thr Leu Val Asp Asp Asn Gly Thr Leu Lys Leu Gly Trp Lys Asp
625                 630                 635                 640

Leu Ser Leu Leu Thr Ala Ser Ala Trp Thr Pro Arg Ser
                645                 650

(2) INFORMATION FOR SEQ ID NO:3: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 23 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

Pro Ala Val Gln Thr Asn Thr Ala Glu Ala Leu Arg Glu Arg Lys Glu
1               5                   10                  15

Glu Ile Lys Arg Gln Lys Gln
            20
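
Entries in this listing use three-letter residue codes. As a small, hedged illustration of how such an entry can be checked and converted to one-letter notation, the Python sketch below maps the 23 residues of SEQ ID NO:3 and verifies the stated length. The helper name `to_one_letter` is hypothetical; the mapping is the standard three-letter to one-letter amino acid code, with `Xaa` mapped to `X`.

```python
# Standard three-letter to one-letter amino acid codes (Xaa = unspecified).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_sequence: str) -> str:
    """Convert whitespace-separated three-letter codes to a one-letter string.

    Raises ValueError for any token that is not a recognized residue code,
    which also flags transcription errors such as 'Gin' for 'Gln'.
    """
    residues = three_letter_sequence.split()
    unknown = [r for r in residues if r not in THREE_TO_ONE]
    if unknown:
        raise ValueError(f"unrecognized residue codes: {unknown}")
    return "".join(THREE_TO_ONE[r] for r in residues)


seq_id_no_3 = (
    "Pro Ala Val Gln Thr Asn Thr Ala Glu Ala Leu Arg Glu Arg Lys Glu "
    "Glu Ile Lys Arg Gln Lys Gln"
)
one_letter = to_one_letter(seq_id_no_3)
print(one_letter, len(one_letter))
```

The length check against the LENGTH field (23 amino acids) is a quick way to catch dropped or duplicated residues when an entry is retyped.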

(2) INFORMATION FOR SEQ ID NO:4: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 23 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

Leu Lys Arg Ala Arg Asn Thr Glu Ala Ala Arg Arg Ser Arg Ala Arg
1               5                   10                  15

Lys Leu Gln Arg Met Lys Gln
            20

(2) INFORMATION FOR SEQ ID NO:5: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 23 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 

Arg Arg Leu Ala Gln Asn Arg Glu Ala Ala Arg Lys Ser Arg Leu Arg
1               5                   10                  15

Lys Lys Ala Tyr Val Gln Gln
            20

(2) INFORMATION FOR SEQ ID NO:6: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 23 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

Ile Arg Arg Glu Arg Asn Lys Met Ala Ala Ala Lys Cys Arg Asn Arg
1               5                   10                  15

Arg Arg Glu Leu Thr Asp Thr
            20

(2) INFORMATION FOR SEQ ID NO:7: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 23 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

Arg Lys Arg Met Arg Asn Arg Ile Ala Ala Ser Lys Cys Arg Lys Arg
1               5                   10                  15

Lys Leu Glu Arg Ile Ala Arg
            20

(2) INFORMATION FOR SEQ ID NO:8: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 23 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

Val Arg Leu Met Lys Asn Arg Glu Ala Ala Arg Glu Cys Arg Arg Lys
1               5                   10                  15

Lys Lys Glu Tyr Val Lys Cys
            20

(2) INFORMATION FOR SEQ ID NO:9: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 23 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

Lys Arg Lys Glu Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Tyr Arg
1               5                   10                  15

Lys Ala Ala His Leu Lys Glu
            20

(2) INFORMATION FOR SEQ ID NO:10: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 23 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 

Met Arg Gln Ile Arg Asn Arg Asp Ser Ala Met Lys Ser Arg Glu Arg
1               5                   10                  15

Lys Lys Ser Tyr Ile Lys Asp
            20

(2) INFORMATION FOR SEQ ID NO:11: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 23 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: 

Arg Arg Met Val Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Lys Lys
1               5                   10                  15

Lys Gln Ala His Leu Ala Asp
            20

(2) INFORMATION FOR SEQ ID NO:12: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 43 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: 

Ala Phe Glu Lys Glu Asp Ser Val His Ile Ile Asp Leu Asp Ile Met
1               5                   10                  15

Gln Gly Leu Gln Trp Pro Gly Leu Phe His Ile Leu Ala Ser Arg Pro
            20                  25                  30

Gly Gly Pro Pro His Val Arg Leu Thr Gly Leu
        35                  40

(2) INFORMATION FOR SEQ ID NO:13: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 43 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 

Ala Val Lys Asn Glu Ser Phe Val His Ile Ile Asp Phe Gln Ile Ser
1               5                   10                  15

Gln Gly Gly Gln Trp Val Ser Leu Ile Arg Ala Leu Gly Ala Arg Pro
            20                  25                  30

Gly Gly Pro Pro Asn Val Arg Ile Thr Gly Ile
        35                  40

(2) INFORMATION FOR SEQ ID NO:14: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 43 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 

Ala Met Glu Gly Glu Lys Met Val His Val Ile Asp Leu Asp Ala Ser
1               5                   10                  15

Glu Pro Ala Gln Trp Leu Ala Leu Leu Gln Ala Phe Asn Ser Arg Pro
            20                  25                  30

Glu Gly Pro Pro His Leu Arg Ile Thr Gly Val
        35                  40

(2) INFORMATION FOR SEQ ID NO:15: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 29 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 

Ala Ile Lys Gly Glu Glu Glu Val His Ile Ile Asp Phe Asp Ile Asn
1               5                   10                  15

Gln Gly Asn Gln Tyr Met Thr Leu Ile Arg Ser Ile Ala
            20                  25

(2) INFORMATION FOR SEQ ID NO:16: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 26 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

Ile His Val Ile Asp Phe Xaa Leu Gly Val Gly Gly Gln Trp Ala Ser
1               5                   10                  15

Phe Leu Gln Glu Leu Ala His Arg Arg Gly
            20                  25

(2) INFORMATION FOR SEQ ID NO:17: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 36 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: peptide 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: 

Val His Ile Ile Xaa Phe Xaa Leu Met Gln Gly Leu Gln Trp Pro Ala
1               5                   10                  15

Leu Met Asp Val Phe Ser Ala Arg Lys Gly Gly Pro Pro Lys Leu Arg
            20                  25                  30

Ile Thr Gly Ile
        35

(2) INFORMATION FOR SEQ ID NO:18: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 1085 base pairs 
          (B) TYPE: nucleic acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: cDNA 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 

G6CACGAGCC CAACCC6TCC TOAOCTTCTT ACTTATATCC ATATCTTOTA TGAAGCCTGC    60
CCTTATTTCA AATTOOGTTA TGAATCTGCT AATCOACCTA TAGCTCAACC TGTCAAGAAC    120
CAAAGTTTTG TOCACATTAT CGATTTCCAG ATTTCTCAAC GTCCTCAATO GGTOAGTTTG    180
ATCCGTGCTC TTOGTGCTAG ACCTGOTGGA CCTCCOAACG TTAOGATAAC GOGAATTGAT    240
CATCC6AGAT CATCCTTTGC TCCTCAAGGA GOACTTOAGT TACTTCOACA AAGACTTGGG    300
AAGCTAGCTG AAATGTGCGG TCTTCCGTTT CACTTCCATO GACCTGCTTT ATGCTGCACG    360
GAAGTOGAAA TCGAGAAGCT AGCAGTTACA AATGGAGAAG CCCTCOCGGT TAACTTCC06    420
CTTGTTCTTC ACCACATOCC TGAT6ACAGT CTAACTGTCO AQAATCACAG AGATAGATTG    480
TTGACATTGC TCAAACACTT CTCACCAAAC GTTCTCACTC TOGTTCACCA AGAAGCGAAT    540
ACAAACACTG CGCCGTTTCT TCCCCGGTTT GTOGAGACAA TGAACCATTA CTTGGCAGTT    600
TTCCAATCAA TAGATGTGAA ACTCCCTAGA GATCACAACC AAAGGATCAA TOTTGAGCAG    660
CATTGTTTCG CTACACAGCT TGTGAATCTT ATACCTTOTO AACGTCTTGA AAGAGAAGAG    720
AGCCACGAGC CACTAOCGAA ATOCAGOTCT CCCTTTCACA TCGCGOCATT TAAACC6TAT    780
CCTTTGACCT COTATGTCAA CCCAACAATC AAAGCATTCC TTGACAGTTA TTCAGAGAAG    840
TATACACTTO AAGAAAGAGA TGCAGCATTO TATTTACCAT GCAAGAATCA ACCTCTTATC    900
ACTTCTTGTO CTTCCACCTA ACTAATAAAA ACCTTGTTCC GTTTCAGAAO AGATTAGAAA    960
CTTCTTTTAA AGTTTGCACA ATCTCTTTGT AAAAGTAAAA CTCATCCATG ATCCGNAGGA    1020
ACAACTTGTC AAATGTTGTA GTAGTAACTC ATATCTTGAT GACCCAAAAA AAAAAAAAAA    1080
AAAAA                                                                1085

(2) INFORMATION FOR SEQ ID NO:19: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 306 amino acids 
          (B) TYPE: amino acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: protein 



    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: 

Gly Thr Ser Pro Thr Gly Pro Glu Leu Leu Thr Tyr Met His Ile Leu
1               5                   10                  15

Tyr Glu Ala Cys Pro Tyr Phe Lys Phe Gly Tyr Glu Ser Ala Asn Gly
            20                  25                  30

Ala Ile Ala Glu Ala Val Lys Asn Glu Ser Phe Val His Ile Ile Asp
        35                  40                  45

Phe Gln Ile Ser Gln Gly Gly Gln Trp Val Ser Leu Ile Arg Ala Leu
    50                  55                  60

Gly Ala Arg Pro Gly Gly Pro Pro Asn Val Arg Ile Thr Gly Ile Asp
65                  70                  75                  80

Asp Pro Arg Ser Ser Phe Ala Arg Gln Gly Gly Leu Glu Leu Val Gly
                85                  90                  95

Gln Arg Leu Gly Lys Leu Ala Glu Met Cys Gly Val Pro Phe Glu Phe
            100                 105                 110

His Gly Ala Ala Leu Phe Cys Thr Glu Val Glu Ile Glu Lys Leu Gly
        115                 120                 125

Val Arg Asn Gly Glu Ala Leu Ala Val Asn Phe Pro Leu Val Leu His
    130                 135                 140

His Met Pro Asp Glu Ser Val Thr Val Glu Asn His Arg Asp Arg Leu
145                 150                 155                 160

Leu Arg Leu Val Lys His Leu Ser Pro Asn Val Val Thr Leu Val Glu
                165                 170                 175

Gln Glu Ala Asn Thr Asn Thr Ala Pro Phe Leu Pro Arg Phe Val Glu
            180                 185                 190

Thr Met Asn His Tyr Leu Ala Val Phe Glu Ser Ile Asp Val Lys Leu
        195                 200                 205

Ala Arg Asp His Lys Glu Arg Ile Asn Val Glu Gln His Cys Leu Ala
    210                 215                 220

Arg Glu Val Glu Asn Leu Ile Ala Cys Glu Gly Val Glu Arg Glu Glu
225                 230                 235                 240

Arg His Glu Pro Leu Gly Lys Trp Arg Ser Arg Phe His Met Ala Gly
                245                 250                 255

Phe Lys Pro Tyr Pro Leu Ser Ser Tyr Val Asn Ala Thr Ile Lys Gly
            260                 265                 270

Leu Leu Glu Ser Tyr Ser Glu Lys Tyr Thr Leu Glu Glu Arg Asp Gly
        275                 280                 285

Ala Leu Tyr Leu Gly Trp Lys Asn Gln Pro Leu Ile Thr Ser Cys Ala
    290                 295                 300

Trp Arg
305

(2) INFORMATION FOR SEQ ID NO:20: 

     (i) SEQUENCE CHARACTERISTICS: 
          (A) LENGTH: 1231 base pairs 
          (B) TYPE: nucleic acid 
          (C) STRANDEDNESS: unknown 
          (D) TOPOLOGY: unknown 

    (ii) MOLECULE TYPE: cDNA 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

GCTATOGIkAO OAOAGAAOAT 60TTCAT0T0 ATTOATCTOO ATCCTTCTOA occAGc?ra»A    60
TCGCTTCCTT TGCTTCAAGG TTTTAACTCT AOGCCTOAAO GTCCACCTCA TTTOAGAATC    120
ACTGOTGTTC ATCACCAGAA GOAAGTGCTT GAACAAATGG CTCATAGACT CATTGAGGAA    180
OCA6AGAAAC TCOATATCC5C OTTTGAOTTT AATCCCGTTG TGAOTAGGTT AGACTGTTTA    240
AATOTAGAAC AGTTGCXKMST TAAAACAOOA CAG6CCTTAG COGTTAGCTC GGTTCTTCAA    300
TTOCATACCT TCTTGGCCTC TOATGATGAT CTCATOAGAA AGAACTGCGC TTTAOOGTTT    360
CA6AACAACC CTAGTGGAGT TGACTTGCAG AGAGTTCTAA TGATOAGCCA TGGCTCTGCA    420
6CT6AGGCAC GTGAGAATGA TATGAGTAAC AACAATGGGT ATAGCCCTAG CGGTGACTOG    480
CCCTCATCTT TGCCTTTACC AAGTTCAOGA AGGACTGATA GCTTCCTCAA TGCTATTTGG    540
CGTTTCTCTC CAAAGGTCAT GGTGGTCACT GACCAAGACT CAGACCACAA CGGCTCCACA    600
CTAATGGAGA GGCTATTAGA ATCACTTTAC ACCTACGCAG CATTGTTTGA TTGCTTGGAA    660
ACAAAAGTTC CAAGAACGTC TCAAGATAGG ATCAAAGTGG AGAAGATGCT CTTCGGGGAG    720
GAGATCAAGA ACATCATATC CTGOGAGGGA TTTGAGAGAA GAGAAAGACA CGAGAAGCTT    780
GAGAAATGGA GCCAGAGGAT CGATTTGGCT GGTTTTGGGA ATGTTCCTCT TAGCTATTAT    840
GOGATGTTGC AGGCTAG6AG ATTGCTTCAA GGGTGCGGTT TTGATGGGTA TAGAATCAAG    900
GAAGAGAGCG GGTGCGCAGT AATTTGCTCG CAAGATCGAC CTCTATACTC GGTATCAGCT    960
TGGAGATGCA GGAAGTGAAT GATATATTAC AOTTTGTCTT CTATTTTGGT TATGAGCAGA    1020
GTCCCTTTCT TTTTTOTATA CATGGGGACA CAATCTTAGT TGTTTTCTGA TGGTGACTTT    1080
CTGTCTCTTT ATGCTATTTT GOCTTAAATG CTTCTACTGC CTCTGCATGT AAAGCCTTTG    1140
TGTCTTGGTT CAATTTGCTC TGGTGTGGGT GTAATACCAA ACCAAATCCA ATTTGAGCTG    1200
AAGATAACTA ATTTQATGAT OOGCTCGTGC C                                   1231


(2) INFORMATION FOR SEQ ID NO:21: 











(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 325 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

Ala Met Glu Gly Glu Lys Met Val His Val Ile Asp Leu Asp Ala Ser
1 5 10 15

Glu Pro Ala Gln Trp Leu Ala Leu Leu Gln Ala Phe Asn Ser Arg Pro
20 25 30

Glu Gly Pro Pro His Leu Arg Ile Thr Gly Val His His Gln Lys Glu
35 40 45

Val Leu Glu Gln Met Ala His Arg Leu Ile Glu Glu Ala Glu Lys Leu
50 55 60

Asp Ile Pro Phe Gln Phe Asn Pro Val Val Ser Arg Leu Asp Cys Leu
65 70 75 80

Asn Val Glu Gln Leu Arg Val Lys Thr Gly Glu Ala Leu Ala Val Ser
85 90 95

Ser Val Leu Gln Leu His Thr Phe Leu Ala Ser Asp Asp Asp Leu Met
100 105 110

Arg Lys Asn Cys Ala Leu Arg Phe His Asn Asn Pro Ser Gly Val Asp
115 120 125

Leu Gln Arg Val Leu Met Met Ser His Gly Ser Ala Ala Glu Ala Arg
130 135 140

Glu Asn Asp Met Ser Asn Asn Asn Gly Tyr Ser Pro Ser Gly Asp Ser
145 150 155 160

Ala Ser Ser Leu Pro Leu Pro Ser Ser Gly Arg Thr Asp Ser Phe Leu
165 170 175

Asn Ala Ile Trp Gly Leu Ser Pro Lys Val Met Val Val Thr Glu Gln
180 185 190

Asp Ser Asp His Asn Gly Ser Thr Leu Met Glu Arg Leu Leu Glu Ser
195 200 205

Leu Tyr Thr Tyr Ala Ala Leu Phe Asp Cys Leu Glu Thr Lys Val Pro
210 215 220

Arg Thr Ser Gln Asp Arg Ile Lys Val Glu Lys Met Leu Phe Gly Glu
225 230 235 240

Glu Ile Lys Asn Ile Ile Ser Cys Glu Gly Phe Glu Arg Arg Glu Arg
245 250 255

His Glu Lys Leu Glu Lys Trp Ser Gln Arg Ile Asp Leu Ala Gly Phe
260 265 270

Gly Asn Val Pro Leu Ser Tyr Tyr Ala Met Leu Gln Ala Arg Arg Leu
275 280 285

Leu Gln Gly Cys Gly Phe Asp Gly Tyr Arg Ile Lys Glu Glu Ser Gly
290 295 300

Cys Ala Val Ile Cys Trp Gln Asp Arg Pro Leu Tyr Ser Val Ser Ala
305 310 315 320

Trp Arg Cys Arg Lys
325

(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1368 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
CTTTOTCAAT OGTAAATGAO CTQAGOCAOA TAOTTTCTAT CGAAGGAGAC CCTTCTCAGA 60

OAATCX2CAGC TTACATGOTG GAAOGTCTAjS CTGCAAOAAT OGCOGCTTCA GGAAAATTCA 120

TCTACAGAOC ATTGAAATGC AAAOAOCCTC CTTOGGATGA GAGGCTTGCA GCTATGCAAG 180

TCCTOTTTOA AGTCTCCCCT TGTTTCAAGT TOCCGTTTTT ACCACCTAAT GGTGCGATAC 240

TTOAAGCAAT CAAAGOTGAA GAAOAAGTTC ACATAATCGA TTTOGATATA AACCAAGGGA 300

ACCAATA€:AT GACACTGATA OGAAGCATTO CTOAGTTGCC TCOTAAAOGA CCTCGCCTGA 360

GGTTAACAGG AATTOAT6AC CCTGAATCAG TCCAAC»CTC CATTGGAGGG CTAAGAATCA 420

TOCGTCTAAG ACTCGAOCAA CTCCCAGAGG ATAATCOAGT ATCCTTCAAA TTCAAAGCAA 480

TGCCTTCAAA GACTTOQATT CTCTCTCCAT GAACACTCGG TTGCAAACCA GGAGAAACCT 540

TAATA6TGAA CTTTCCATTC CAACTTCACC ACATGCCTGA CCAGACTGTC ACAACAGTAA 600

ACCAGCGGGA CGACCTACTT CAGATOGTCA AAAGCTTAAA CCCAAAGCTT GTCACGGTCG 660

TTGAACAAGA CGTGAAGACA AACACTTCAC CCTTCTTTCC CAGATTCATA GAGGCTTACG 720

AATACTACTC AGCACTTTTC CACTCTCTAO ACATCACACT TCCAAGAGAA AGCCAAGAGA 780

GGATCAATCT AGAAAGACAG TGTCTCGCTA GAGAGATAGT CAACATTCTT GCTTGCGAAG 840

GAOAAGAACG GATAGA6AGA TACGAGGCTG OGGGAAAATG GAGAGCAAGG ATGATGATGG 900

CTCGATTCAA TCCAAAACCA ATGACTCCTA AACTAACCAA CAATATACAA AACCTGATAA 960

AGCAACAATA TTGCAATAAO TACAAGCTTA AAGAAQAAAT GGGTGAGCTC CATTTTTGCT 1020

GGGAGGAGAA AAGCTTAATC GTTCCTTCAG CTTGGAGGTA AGATAAGTGA CAAGAGCATA 1080

TAGTCTTTAT GTTTCATAAA ACATAATTAT GTTTTTACTG TAATCTTGGG TTATTGTGTA 1140

ACTGGTTAAA TCATCTCCAT GTATTATTAC CAGAOGTTAG GGCTGATCAC AGGTACTAAA 1200

AGCTAATCTA ACACTTATOG AAGAATTTTT CTTTCTTTTT TTTCCCTATT ATATAAAAAT 1260

AATTAGAGTT TTGCTTCTAA ACCTATTTGC TAAGTGTCAA TGACTCTTTA CATCTTCATA 1320

TTTCACTTCA AATGGTTAAA TTTGTTAAOG TTCTCACTTA AAAAAAAA 1368

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 351 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

Leu Ser Met Val Asn Glu Leu Arg Gln Ile Val Ser Ile Gln Gly Asp
1 5 10 15

Pro Ser Gln Arg Ile Ala Ala Tyr Met Val Glu Gly Leu Ala Ala Arg
20 25 30

Met Ala Ala Ser Gly Lys Phe Ile Tyr Arg Ala Leu Lys Cys Lys Glu
35 40 45

Pro Pro Ser Asp Glu Arg Leu Ala Ala Met Gln Val Leu Phe Glu Val
50 55 60

Cys Pro Cys Phe Lys Phe Gly Phe Leu Ala Ala Asn Gly Ala Ile Leu
65 70 75 80

Glu Ala Ile Lys Gly Glu Glu Glu Val His Ile Ile Asp Phe Asp Ile
85 90 95

Asn Gln Gly Asn Gln Tyr Met Thr Leu Ile Arg Ser Ile Ala Glu Leu
100 105 110

Pro Gly Lys Arg Pro Arg Leu Arg Leu Thr Gly Ile Asp Asp Pro Glu
115 120 125

Ser Val Gln Arg Ser Ile Gly Gly Leu Arg Ile Ile Asn Leu Arg Leu
130 135 140

Glu Gln Leu Ala Glu Asp Asn Gly Val Ser Phe Lys Phe Lys Ala Met
145 150 155 160

Pro Ser Lys Thr Ser Ile Val Ser Pro Ser Thr Leu Gly Cys Lys Pro
165 170 175

Gly Glu Thr Leu Ile Val Asn Phe Ala Phe Gln Leu His His Met Pro
180 185 190

Asp Glu Ser Val Thr Thr Val Asn Gln Arg Asp Glu Leu Leu His Met
195 200 205

Val Lys Ser Leu Asn Pro Leu Val Thr Val Val Glu Gln Asp Val Asn
210 215 220

Thr Asn Thr Ser Pro Phe Phe Pro Arg Phe Ile Glu Ala Tyr Glu Tyr
225 230 235 240

Tyr Ser Ala Val Phe Glu Ser Leu Asp Met Thr Leu Pro Arg Glu Ser
245 250 255

Gln Glu Arg Met Asn Val Glu Arg Gln Cys Leu Ala Arg Asp Ile Val
260 265 270

Asn Ile Val Ala Cys Glu Gly Glu Glu Arg Ile Glu Arg Tyr Glu Ala
275 280 285

Ala Gly Lys Trp Arg Ala Arg Met Met Met Ala Gly Phe Asn Pro Lys
290 295 300

Pro Met Ser Ala Lys Val Thr Asn Asn Ile Gln Asn Leu Ile Lys Gln
305 310 315 320

Gln Tyr Cys Asn Lys Tyr Lys Leu Lys Glu Glu Met Gly Glu Leu His
325 330 335

Phe Cys Trp Glu Glu Lys Ser Leu Ile Val Ala Ser Ala Trp Arg
340 345 350

(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 100 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
CCAGCAOCCO TTCOAOCCCC AOCACCOTOT OCACATCATC 6ACCT0GACA TCATCCAOGC 
GCTGCAOTGG OOGGGCCTCC TCGACATCCT TCCCTCCCCC 
(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

Gln Glu Ala Phe Glu Arg Glu Glu Arg Val His Ile Ile Asp Leu Asp
1 5 10 15

Ile Met Gln Gly Leu Gln Trp Pro Gly Leu Phe His Ile Leu Ala Ser
20 25 30

Arg

(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1094 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:



CCACCOGTCC GTCAAAGGAT ACAACCATCT ACACATAATT CACTTTTCCC TGAT6CAAGG 60

TCTCCACTGG OCGGCACTCA TGGATCTCTT CTCCGCCOGT GAGGGTGGGC CACCAAAGCT 120

CCGAATCACA GGCATTOGCC CGAACCCAAT AGGTOOCCGT GAOGAGCTCC ATGAAGTGGG 180

AATTOGCCTC GCCAAGTATG CACACTCGGT OGGTATOOAC TTCACTTTCC AGGGAGTCTG 240

T6TCGATCAA CTTOATAGGT TCTGCGACTO GATGCTTCTC AAACCAATCA AAGGAGAGOC 300

AGTTGCCATA AACTCCATCC TACAACTCCA TCC^CCTCCTC GTTOACCCAG ATGCAAACCC 360

AGTGGTGCCC GCACCAATAG ATATCCTCCT CAAATTGOTC ATCAAGATAA ACCCCATGAT 420

CTTCACGGTG GTTGAGCATG AGGCAGATCA CAACAGACCA CCACTACTAG AGAGGTTCAC 480

TAATGCCCTC TTCCACTATG OGACCATGTT TGACTCTTTG AGGCCATGC ATOGTTGTAC 540

CAGTGGTAGJi GACATCACCG ACTCACTCAC ACAGGTGTAC CTTOOAGGTG AGATTTTTGA 600

CATTOTCTGC OGOGAOOGCA GTOGAOOCAC OOAAOGTCAT OAOTTGTTTO OTCACTOGAG 660

GGAGAGGCTC ACCTATGCTO OGCTAACTCA AGTGTGGTTC GACCCCOATO AOGTTGACAC 720

GGTAAAAOAC CAGTTGATCC ATGTGACATC CTTATCTOGC TCTGOGTTCA ACATCCTAGT 780

GTGTGATGGC AGCCTTOCAC TAGOGTGGCA TAATCGCCCG TTATATGTGG CAACAGCTTG 840

GTGTGTGACA GGAGGAAATG CTGOCAGTTC CATGGTTGGC AACATCTGTA AGGGTAGAAA 900

TGATAGTAGA AGAAAGGAAA ACOOTAATGG ACCCATGGAG TAGCAOOAA6 AATAACGATG 960

TCATGAGCAA ATOGATCAAG TAATAAAATG CACTGATGAC ATGCATGGTO ATCTAAAGTT 1020

TTTTTGCGTG AATGTGCAAT GACGAATTCT TCAATTTCAA TAACCTAATC ATGAGACTCA 1080

AAAAAAAAAA AAAA 1094


(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 313 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

His Ala Ser Val Lys Gly Tyr Asn His Val His Ile Ile Asp Phe Ser
1 5 10 15

Leu Met Gln Gly Leu Gln Trp Pro Ala Leu Met Asp Val Phe Ser Ala
20 25 30

Arg Glu Gly Gly Pro Pro Lys Leu Arg Ile Thr Gly Ile Gly Pro Asn
35 40 45

Pro Ile Gly Gly Arg Asp Glu Leu His Glu Val Gly Ile Arg Leu Ala
50 55 60

Lys Tyr Ala His Ser Val Gly Ile Asp Phe Thr Phe Gln Gly Val Cys
65 70 75 80

Val Asp Gln Leu Asp Arg Leu Cys Asp Trp Met Leu Leu Lys Pro Ile
85 90 95

Lys Gly Glu Ala Val Ala Ile Asn Ser Ile Leu Gln Leu His Arg Leu
100 105 110

Leu Val Asp Pro Asp Ala Asn Pro Val Val Pro Ala Pro Ile Asp Ile
115 120 125

Leu Leu Lys Leu Val Ile Lys Ile Asn Pro Met Ile Phe Thr Val Val
130 135 140

Glu His Glu Ala Asp His Asn Arg Pro Pro Leu Leu Glu Arg Phe Thr
145 150 155 160

Asn Ala Leu Phe His Tyr Ala Thr Met Phe Asp Ser Leu Glu Ala Met
165 170 175

His Arg Cys Thr Ser Gly Arg Asp Ile Thr Asp Ser Leu Thr Glu Val
180 185 190

Tyr Leu Arg Gly Glu Ile Phe Asp Ile Val Cys Gly Glu Gly Ser Ala
195 200 205

Arg Thr Glu Arg His Glu Leu Phe Gly His Trp Arg Glu Arg Leu Thr
210 215 220

Tyr Ala Gly Leu Thr Gln Val Trp Phe Asp Pro Asp Glu Val Asp Thr
225 230 235 240

Leu Lys Asp Gln Leu Ile His Val Thr Ser Leu Ser Gly Ser Gly Phe
245 250 255

Asn Ile Leu Val Cys Asp Gly Ser Leu Ala Leu Ala Trp His Asn Arg
260 265 270

Pro Leu Tyr Val Ala Thr Ala Trp Cys Val Thr Gly Gly Asn Ala Ala
275 280 285

Ser Ser Met Val Gly Asn Ile Cys Lys Gly Thr Asn Asp Ser Arg Arg
290 295 300

Lys Glu Asn Arg Asn Gly Pro Met Glu
305 310

(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 611 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:



CCCAACTTGG GAAGCCCTTC CTCCGCTCCO CCTCCTACCT CAAGGAGGCC CTCCTCCTCC 60

CACTCGCCGA CAGCCACCAT GGCTCCTCOG GCGTCACCTC GCOGCTCGAC GTTGCCCTCA 120

ACCTTGCAGC ATACAAGTCT TTCTCTGACC TGTCACCTGT GCTCCAGTTC ACTAACTTTA 180

CCGCAACAAG GCGCTTCTTG ATGAGATTGO TOGCATGGCA ACTTCCT6CA TCCATGTCAT 240

TGACTTTGAT CTCGGTGTTG GTGGTCAGTG GCCTTCCTTC TTGCAGGAGC TTGCCCACCG 300

COGGGGAGCT GGAGGTATGG CCTTGCCGTT GTTOAAGCTC ACGGCTTTCA TGTOGACTGC 360

TTCTCACCAT CCACTGGAGC TGCACCTTAC CCAGGATAAC CTCTCTCAOT TTGCCGCAGA 420

GCTCAGAATT CCTTTOGAAT TCAATGCCGT CAGTCTTGAT GCATTCAATC CTGCGGAATC 480

TATTTCTTCC TCTGGTGATO AAGTTOTTGC TGTTAOOCTC CCTQTTGGCT CCTCTGCTCG 540

TGGACCACCG CTGCCAOCGA TTCTTCGOTT OGT6AAACAG CTTTGTCCTA AGGTTGTCGT 600

GGCTATTCAT C 611
(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 502 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TACAOAGCAA CA6CAGTATA ATATTAATTC 60

TGTACCACAC AACCATTTGA TAGGTTAAAT TACCCTCTAG TCTCTACTCA TAAGCAOTGT 120 

TTCCAATGAC AT6ATCATGG CTAATT6AGC A6AGCATGGC AACAACCTAA AGCAACATCA 180 

TTAGCTATAG AGACTGACAC CAATATTOCT AAATCCACTA 6GCTAGCTAA TAAOCTGCAA 240 

CGAAAAGCAA TATGAAGAGT TCAACAGCTC AAGACAACAA TTTCATTTGC AACATTTAAT 300 

T CAAGAATA AATGGACATT ACTGGAGTGG TCCATCCTTC CAAACGGTGG TGGAACCTTG 360 

GTGGAGTGAA GCTTATOGCT GATCAGCACC GCCAAGATGA TATOGATACA AGCTCCCCAC 420 

CCTGCCAGTA GAGCOTAAGA GCAGCTCOGC GTTTCTCCAC ATGCAATCCT CGGACCTGCA 480 

CCCGCTTCAG GAGGCAGTCT GC 502 
(2) INFORMATION FOR SEQ ID NO:30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 298 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:

Pro Gln Gln Gln Gln Gln His Gln Gln Gln Gln Gln Gln His Lys Pro
1 5 10 15

Pro Pro Pro Pro Ile Gln Gln Gln Glu Arg Glu Asn Ser Ser Thr Asp
20 25 30

Ala Pro Pro Gln Pro Glu Thr Val Thr Ala Thr Val Pro Ala Val Gln
35 40 45

Thr Asn Thr Ala Glu Ala Leu Arg Glu Arg Lys Glu Glu Ile Lys Arg
50 55 60

Gln Lys Gln Asp Glu Glu Gly Leu His Leu Leu Thr Leu Leu Leu Gln
65 70 75 80

Cys Ala Glu Ala Val Ser Ala Asp Asn Leu Glu Glu Ala Asn Lys Leu
85 90 95

Leu Leu Glu Ile Ser Gln Leu Ser Thr Pro Tyr Gly Thr Ser Ala Gln
100 105 110

Arg Val Ala Ala Tyr Phe Ser Glu Ala Met Ser Ala Arg Leu Leu Asn
115 120 125

Ser Cys Leu Gly Ile Tyr Ala Ala Leu Pro Ser Arg Trp Met Pro Gln
130 135 140

Thr His Ser Leu Lys Met Val Ser Ala Phe Gln Val Phe Asn Gly Ile
145 150 155 160

Ser Pro Leu Val Lys Phe Ser His Phe Thr Ala Asn Gln Ala Ile Gln
165 170 175

Glu Ala Phe Glu Lys Glu Asp Ser Val His Ile Ile Asp Leu Asp Ile
180 185 190

Met Gln Gly Leu Gln Trp Pro Gly Leu Phe His Ile Leu Ala Ser Arg
195 200 205

Pro Gly Gly Pro Pro His Val Arg Leu Thr Gly Leu Gly Thr Ser Met
210 215 220

Glu Ala Leu Gln Ala Thr Gly Lys Arg Leu Ser Asp Phe Thr Asp Lys
225 230 235 240

Leu Gly Leu Pro Phe Glu Phe Cys Pro Leu Ala Glu Lys Val Gly Asn
245 250 255

Asp Leu Thr Glu Arg Leu Asn Val Arg Lys Arg Glu Ala Ala Val His
260 265 270

Trp Leu Gln His Ser Leu Tyr Asp Val Thr Gly Ser Asp Ala His Thr
275 280 285

Leu Trp Leu Leu Gln Arg Leu Ala Pro Lys
290 295

(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 307 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

Gly Thr Ser Pro Thr Gly Pro Glu Leu Leu Thr Tyr Met His Ile Leu
1 5 10 15

Tyr Glu Ala Cys Pro Tyr Phe Lys Phe Gly Tyr Glu Ser Ala Asn Gly
20 25 30

Ala Ile Ala Glu Ala Val Lys Asn Glu Ser Phe Val His Ile Ile Asp
35 40 45

Phe Gln Ile Ser Gln Gly Gly Gln Trp Val Ser Leu Ile Arg Ala Leu
50 55 60

Gly Ala Arg Pro Gly Gly Pro Pro Asn Val Arg Ile Thr Gly Ile Asp
65 70 75 80

Asp Pro Arg Ser Ser Phe Ala Arg Gln Gly Gly Leu Glu Leu Val Gly
85 90 95

Gln Arg Leu Gly Lys Leu Ala Glu Met Cys Gly Val Pro Phe Glu Phe
100 105 110

His Gly Ala Ala Leu Cys Cys Thr Glu Val Glu Ile Glu Lys Leu Gly
115 120 125

Val Arg Asn Gly Glu Ala Leu Ala Val Asn Phe Pro Leu Val Leu His
130 135 140

His Met Pro Asp Glu Ser Val Thr Val Glu Asn His Arg Asp Arg Leu
145 150 155 160

Leu Arg Leu Val Lys His Leu Ser Pro Asn Val Val Thr Leu Val Glu
165 170 175

Gln Glu Ala Asn Thr Asn Thr Ala Pro Phe Leu Pro Arg Phe Val Glu
180 185 190

Thr Met Asn His Tyr Leu Ala Val Phe Glu Ser Ile Asp Val Lys Leu
195 200 205

Ala Arg Asp His Lys Glu Arg Ile Asn Val Glu Gln His Cys Leu Ala
210 215 220

Arg Glu Val Val Asn Leu Ile Ala Cys Glu Gly Val Glu Arg Glu Glu
225 230 235 240

Arg His Glu Pro Leu Gly Lys Trp Arg Ser Arg Phe His Met Ala Gly
245 250 255

Phe Lys Pro Tyr Pro Leu Ser Ser Tyr Val Asn Ala Thr Ile Lys Gly
260 265 270

Leu Leu Glu Ser Tyr Ser Glu Lys Tyr Thr Leu Glu Glu Arg Asp Gly
275 280 285

Ala Leu Tyr Leu Gly Trp Lys Asn Gln Pro Leu Ile Thr Ser Cys Ala
290 295 300

Trp Arg Xaa
305

(2) INFORMATION FOR SEQ ID NO:32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 353 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

Leu Ser Met Val Asn Glu Leu Arg Gln Ile Val Ser Ile Gln Gly Asp
1 5 10 15

Pro Ser Gln Arg Ile Ala Ala Tyr Met Val Glu Gly Leu Ala Ala Arg
20 25 30

Met Ala Ala Ser Gly Lys Phe Ile Tyr Arg Ala Leu Lys Cys Lys Glu
35 40 45

Pro Pro Ser Asp Glu Arg Leu Ala Ala Met Gln Val Leu Phe Glu Val
50 55 60

Cys Pro Cys Phe Lys Phe Gly Phe Leu Ala Ala Asn Gly Ala Ile Leu
65 70 75 80

Glu Ala Ile Lys Gly Glu Glu Glu Val His Ile Ile Asp Phe Asp Ile
85 90 95

Asn Gln Gly Asn Gln Tyr Met Thr Leu Ile Arg Ser Ile Ala Glu Leu
100 105 110

Pro Gly Lys Arg Pro Arg Leu Arg Leu Thr Gly Ile Asp Asp Pro Glu
115 120 125

Ser Val Gln Arg Ser Ile Gly Gly Leu Arg Ile Ile Gly Leu Arg Leu
130 135 140

Glu Gln Leu Ala Glu Asp Asn Gly Val Ser Phe Lys Phe Lys Ala Met
145 150 155 160

Pro Ser Lys Thr Ser Ile Val Ser Pro Ser Thr Leu Gly Cys Lys Pro
165 170 175

Gly Glu Thr Leu Ile Val Asn Phe Ala Phe Gln Leu His His Met Pro
180 185 190

Asp Glu Ser Val Thr Thr Val Asn Gln Arg Asp Glu Leu Leu His Met
195 200 205

Val Lys Ser Leu Asn Pro Lys Leu Val Thr Val Val Glu Gln Asp Val
210 215 220

Asn Thr Asn Thr Ser Pro Phe Phe Pro Arg Phe Ile Glu Ala Tyr Glu
225 230 235 240

Tyr Tyr Ser Ala Val Phe Glu Ser Leu Asp Met Thr Leu Pro Arg Glu
245 250 255

Ser Gln Glu Arg Met Asn Val Glu Arg Gln Cys Leu Ala Arg Asp Ile
260 265 270

Val Asn Ile Val Ala Cys Glu Gly Glu Glu Arg Ile Glu Arg Tyr Glu
275 280 285

Ala Ala Gly Lys Trp Arg Ala Arg Met Met Met Ala Gly Phe Asn Pro
290 295 300

Lys Pro Met Ser Ala Lys Val Thr Asn Asn Ile Gln Asn Leu Ile Lys
305 310 315 320

Gln Gln Tyr Cys Asn Lys Tyr Lys Leu Lys Glu Glu Met Gly Glu Leu
325 330 335

His Phe Cys Trp Glu Glu Lys Ser Leu Ile Val Ala Ser Ala Trp Arg
340 345 350

Xaa

(2) INFORMATION FOR SEQ ID NO:33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 326 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:

Ala Met Glu Gly Glu Lys Met Val His Val Ile Asp Leu Asp Ala Ser
1 5 10 15

Glu Pro Ala Gln Trp Leu Ala Leu Leu Gln Ala Phe Asn Ser Arg Pro
20 25 30

Glu Gly Pro Pro His Leu Arg Ile Thr Gly Val His His Gln Lys Glu
35 40 45

Val Leu Glu Gln Met Ala His Arg Leu Ile Glu Glu Ala Glu Lys Leu
50 55 60

Asp Ile Pro Phe Gln Phe Asn Pro Val Val Ser Arg Leu Asp Cys Leu
65 70 75 80

Asn Val Glu Gln Leu Arg Val Lys Thr Gly Glu Ala Leu Ala Val Ser
85 90 95

Ser Val Leu Gln Leu His Thr Phe Leu Ala Ser Asp Asp Asp Leu Met
100 105 110

Arg Lys Asn Cys Ala Leu Arg Phe Gln Asn Asn Pro Ser Gly Val Asp
115 120 125

Leu Gln Arg Val Leu Met Met Ser His Gly Ser Ala Ala Glu Ala Arg
130 135 140

Glu Asn Asp Met Ser Asn Asn Asn Gly Tyr Ser Pro Ser Gly Asp Ser
145 150 155 160

Ala Ser Ser Leu Pro Leu Pro Ser Ser Gly Arg Thr Asp Ser Phe Leu
165 170 175

Asn Ala Ile Trp Gly Leu Ser Pro Lys Val Met Val Val Thr Glu Gln
180 185 190

Asp Ser Asp His Asn Gly Ser Thr Leu Met Glu Arg Leu Leu Glu Ser
195 200 205

Leu Tyr Thr Tyr Ala Ala Leu Phe Asp Cys Leu Glu Thr Lys Val Pro
210 215 220

Arg Thr Ser Gln Asp Arg Ile Lys Val Glu Lys Met Leu Phe Gly Glu
225 230 235 240

Glu Ile Lys Asn Ile Ile Ser Cys Glu Gly Phe Glu Arg Arg Glu Arg
245 250 255

His Glu Lys Leu Glu Lys Trp Ser Gln Arg Ile Asp Leu Ala Gly Phe
260 265 270

Gly Asn Val Pro Leu Ser Tyr Tyr Ala Met Leu Gln Ala Arg Arg Leu
275 280 285

Leu Gln Gly Cys Gly Phe Asp Gly Tyr Arg Ile Lys Glu Glu Ser Gly
290 295 300

Cys Ala Val Ile Cys Trp Gln Asp Arg Pro Leu Tyr Ser Val Ser Ala
305 310 315 320

Trp Arg Cys Arg Lys Xaa
325

(2) INFORMATION FOR SEQ ID NO:34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 277 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:

Asn Lys Arg Leu Lys Ser Cys Ser Ser Pro Asp Ser Met Val Thr Ser
1 5 10 15

Thr Ser Thr Gly Thr Gln Ile Gly Gly Val Ile Gly Thr Thr Val Thr
20 25 30

Thr Thr Thr Thr Thr Thr Thr Ala Ala Ala Glu Ser Thr Arg Ser Val
35 40 45

Ile Leu Val Asp Ser Gln Glu Asn Gly Val Arg Leu Val His Ala Leu
50 55 60

Met Ala Cys Ala Glu Ala Ile Gln Gln Asn Asn Leu Thr Leu Ala Glu
65 70 75 80

Ala Leu Val Lys Gln Ile Gly Cys Leu Ala Val Ser Gln Ala Gly Ala
85 90 95

Met Arg Lys Val Ala Thr Tyr Phe Ala Glu Ala Leu Ala Arg Arg Ile
100 105 110

Tyr Arg Leu Ser Pro Pro Gln Asn Gln Ile Asp His Cys Leu Ser Asp
115 120 125

Thr Leu Gln Met His Phe Tyr Glu Thr Cys Pro Tyr Leu Lys Phe Ala
130 135 140

His Phe Thr Ala Asn Gln Ala Ile Leu Glu Ala Phe Glu Gly Lys Lys
145 150 155 160

Arg Val His Val Ile Asp Phe Ser Met Asn Gln Gly Leu Gln Trp Pro
165 170 175

Ala Leu Met Gln Ala Leu Ala Leu Arg Glu Gly Gly Pro Pro Thr Phe
180 185 190

Arg Leu Thr Gly Ile Gly Pro Pro Ala Pro Asp Asn Ser Asp His Leu
195 200 205

His Glu Val Gly Cys Lys Leu Ala Gln Leu Ala Glu Ala Ile His Val
210 215 220

Glu Phe Glu Tyr Arg Gly Phe Val Ala Asn Ser Leu Ala Asp Leu Asp
225 230 235 240

Ala Ser Met Leu Glu Leu Arg Pro Ser Asp Thr Glu Ala Val Ala Val
245 250 255

Asn Ser Val Phe Glu Leu His Lys Leu Leu Gly Arg Xaa Gly Gly Ile
260 265 270

Glu Lys Val Leu Gly
275

(2) INFORMATION FOR SEQ ID NO:35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 262 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:

Gly Gly Gly Gly Asp Thr Tyr Thr Thr Asn Lys Arg Leu Lys Cys Ser
1 5 10 15

Asn Gly Val Val Glu Thr Thr Thr Ala Thr Ala Glu Ser Thr Arg His
20 25 30

Val Val Leu Val Asp Ser Gln Glu Asn Gly Val Arg Leu Val His Ala
35 40 45

Leu Leu Ala Cys Ala Glu Ala Val Gln Lys Glu Asn Leu Thr Val Ala
50 55 60

Glu Ala Leu Val Lys Gln Ile Gly Phe Leu Ala Val Ser Gln Ile Gly
65 70 75 80

Ala Met Arg Gln Val Ala Thr Tyr Phe Ala Glu Ala Leu Ala Arg Arg
85 90 95

Ile Tyr Arg Leu Ser Pro Ser Gln Ser Pro Ile Asp His Ser Leu Ser
100 105 110

Asp Thr Leu Gln Met His Phe Tyr Glu Thr Cys Pro Tyr Leu Lys Phe
115 120 125

Ala His Phe Thr Ala Asn Gln Ala Ile Leu Glu Ala Phe Gln Gly Lys
130 135 140

Lys Arg Val His Val Ile Asp Phe Ser Met Ser Gln Gly Leu Gln Trp
145 150 155 160

Pro Ala Leu Met Gln Ala Leu Ala Leu Arg Pro Gly Gly Pro Pro Val
165 170 175

Phe Arg Leu Thr Gly Ile Gly Pro Pro Ala Pro Asp Asn Phe Asp Tyr
180 185 190

Leu His Glu Val Gly Cys Lys Leu Ala His Leu Ala Glu Ala Ile His
195 200 205

Val Glu Phe Glu Tyr Arg Gly Phe Val Ala Asn Thr Leu Ala Asp Leu
210 215 220

Asp Ala Ser Met Leu Glu Leu Arg Pro Ser Glu Ile Glu Ser Val Ala
225 230 235 240

Val Asn Ser Val Phe Glu Leu His Lys Leu Leu Gly Arg Pro Gly Ala
245 250 255

Ile Asp Lys Val Leu Gly
260

(2) INFORMATION FOR SEQ ID NO:36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 203 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:

Gln Leu Gly Lys Pro Phe Leu Arg Ser Ala Ser Tyr Leu Lys Glu Ala
1 5 10 15

Leu Leu Leu Ala Leu Ala Asp Ser His His Gly Ser Ser Gly Val Thr
20 25 30

Ser Pro Leu Asp Val Ala Leu Lys Leu Ala Ala Tyr Lys Ser Phe Ser
35 40 45

Asp Leu Ser Pro Val Leu Gln Phe Thr Asn Phe Thr Ala Asn Lys Ala
50 55 60

Leu Leu Asp Glu Ile Gly Gly Met Ala Thr Ser Cys Ile His Val Ile
65 70 75 80

Asp Phe Asp Leu Gly Val Gly Gly Gln Trp Ala Ser Phe Leu Gln Glu
85 90 95

Leu Ala His Arg Arg Gly Ala Gly Gly Met Ala Leu Pro Leu Leu Lys
100 105 110

Leu Thr Ala Phe Met Ser Thr Ala Ser His His Pro Leu Glu Leu His
115 120 125

Leu Thr Gln Asp Asn Leu Ser Gln Phe Ala Ala Glu Leu Arg Ile Pro
130 135 140

Phe Glu Phe Asn Ala Val Ser Leu Asp Ala Phe Asn Pro Ala Glu Ser
145 150 155 160

Ile Ser Ser Ser Gly Asp Glu Val Val Ala Val Ser Leu Pro Val Gly
165 170 175

Cys Ser Ala Arg Ala Pro Pro Leu Pro Ala Ile Leu Arg Leu Val Lys
180 185 190

Gln Leu Cys Pro Lys Val Val Val Ala Ile Asp
195 200

(2) INFORMATION FOR SEQ ID NO:37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 131 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:

His Ala Ser Val Lys Gly Tyr Asn His Val His Ile Ile Asp Phe Ser
1 5 10 15

Leu Met Gln Gly Leu Gln Trp Pro Ala Leu Met Asp Val Phe Ser Ala
20 25 30

Arg Glu Gly Gly Pro Pro Lys Leu Arg Ile Thr Gly Ile Gly Pro Asn
35 40 45

Pro Ile Gly Gly Arg Asp Glu Leu His Glu Val Gly Ile Arg Leu Ala
50 55 60

Lys Tyr Ala His Ser Val Gly Ile Asp Phe Thr Phe Gln Gly Val Cys
65 70 75 80

Val Asp Gln Leu Asp Arg Leu Cys Asp Trp Met Leu Leu Lys Pro Ile
85 90 95

Lys Gly Glu Ala Val Ala Ile Asn Ser Ile Leu Gln Leu His Arg Leu
100 105 110

Leu Val Asp Pro Asp Ala Asn Pro Val Val Pro Ala Pro Ile Asp Ile
115 120 125

Leu Leu Lys
130

(2) INFORMATION FOR SEQ ID NO:38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:

Gln Glu Ala Phe Glu Arg Glu Glu Arg Val His Ile Ile Asp Leu Asp
1 5 10 15

Ile Met Gln Gly Leu Gln Trp Pro Gly Leu Phe His Ile Leu Ala Ser
20 25 30

Arg



(2) INFORMATION FOR SEQ ID NO:39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:

Phe Ala Gly Cys Arg Arg Val His Val Val Asp Phe Gly Ile Lys Gln
1 5 10 15

Gly Met Gln Trp Pro Ala Leu Leu Xaa Asp Leu Ala Leu
20 25

(2) INFORMATION FOR SEQ ID NO:40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 73 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:

Gly Arg Asn Gly Arg Thr Leu Trp Leu Gly Glu Gly His Ile Asp Leu
1 5 10 15

Trp Pro Leu Gln Gly Leu Leu Ser Gln Gly Leu Gln Arg Ala Leu Cys
20 25 30

Ala Arg Pro Leu Gly Ala Pro His Val Phe Leu Pro Gly Leu His Thr
35 40 45

Leu Ser Leu Gly Leu Gln Xaa Arg His Leu Leu Val His Met Met Ala
50 55 60

Leu Ser Tyr Ser Tyr Gly Arg Xaa Pro
65 70

(2) INFORMATION FOR SEQ ID NO:41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 59 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:

Thr Ser Asp Ser Ala Ser Ser Phe Asn Ile Pro Thr Ser Ala Gln Asn
1 5 10 15

His Tyr Ala Thr Gly Ser Phe Ser Thr Asn Ser Arg Thr Thr Asn Val
20 25 30

Ala Thr Ala Thr Thr Asn Ser Ala Thr Ala His Trp Val Ala Thr Asp
35 40 45

Ala Glu His Thr Asp Thr Ile Ile Ala Gln Pro
50 55

(2) INFORMATION FOR SEQ ID NO:42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 110 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:

Arg Xaa Phe Asp Ser Leu Glu His Asp Ala Ser Lys Gly Glu Pro Arg
1 5 10 15

Glu Asp Glu Arg Gly Arg Xaa Cys Leu Ala Arg Asn Ile Val Asn Ile
20 25 30

Val Xaa Cys Lys Xaa Glu Glu Arg Ile Glu Arg Tyr Glu Val Thr Gly
35 40 45

Lys Trp Arg Ala Arg Met Met Met Ala Gly Phe Ser Pro Arg Pro Met
50 55 60

Ser Gly Arg Val Thr Ser Asn Ile Glu Ser Leu Ile Lys Arg Asp Tyr
65 70 75 80

Cys Ser Lys Tyr Lys Val Lys Glu Glu Met Gly Glu Leu His Phe Ser
85 90 95

Trp Glu Glu Lys Ser Leu Ile Val Ala Ser Ala Trp Ser Xaa
100 105 110

(2) INFORMATION FOR SEQ ID NO:43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 137 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:

Asn Gly Ser Tyr Asn Ala Pro Phe Phe Val Thr Arg Phe Arg Glu Ala
1 5 10 15

Leu Phe His Tyr Ser Ala Ile Phe Asp Met Leu Glu Thr Asn Ile Pro
20 25 30

Lys Asp Asn Glu Gln Arg Leu Leu Ile Glu Ser Ala Leu Phe Ser Arg
35 40 45

Glu Xaa Asn Val Ile Ser Cys Glu Gly Leu Glu Arg Met Glu Arg Pro
50 55 60

Glu Thr Tyr Lys Gln Trp Gln Val Arg Asn Gln Arg Val Gly Phe Lys
65 70 75 80

Gln Leu Pro Leu Asn Gln Asp Met Met Lys Arg Ala Arg Xaa Glu Gly
85 90 95

Gln Val Leu Pro Thr Arg Thr Phe Ile Ile Asp Glu Asp Asn Arg Trp
100 105 110

Leu Leu Gln Gly Trp Lys Gly Arg Ile Leu Phe Ala Leu Ser Thr Trp
115 120 125

Lys Pro Asp Asn Arg Ser Ser Ser Xaa
130 135

(2) INFORMATION FOR SEQ ID NO:44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:

Asn Gly Gly Ala Phe Ala Pro Ser Thr Trp Thr Ala Arg Ser Leu Asn
1 5 10 15

Gly Gly Ala Phe Ala Pro Ser Thr Trp Thr Ala Arg Ser Leu Pro Val
20 25 30

Pro Ser Ser Pro Ser Thr Asp Ser Phe
35 40

(2) INFORMATION FOR SEQ ID NO:45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1279 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

CCGCCTATCT TCTACOGCCA CCACCACCAT ACACCTCCGC CGGCAAAGCG CCTCAACCCT 60 

GOTCCCGTGC CGATAACACA CCAGCTOCTT AACCCAGCAO AGCTCATAGA GACCGACAOG 120 

TCTCTAGCTC ACOGGATATT CGCGCOGCTC AATCAACAGC TCTCTTCTCC CGTCCCGAA6 180 

OCATTAOAAA GAGCAGCTTT TTACTTCAAA CAAGCTCTCA ATAATCTCCT TCACAACOTC 240 

TCCCAAACCC TAAACCCTTA TTCCCTCATC TTCAAGATOO CTGCTTACAA ATCCTTCTCA 300 

GACATCTCTC CCGTTCTTCA GTTOCCCAAC TTTACCTCCA ACCAAOCCCT CTTAGAOTCC 360 

TTCCATOGCT TOCACOOTCT CCACATCATC OACTTOOATA TCCGCTACG TOCCCAATCO 420 

GCTTCCCTCA TGCAAOAGCT TGTTCTCCOC GACAACOCOG CTCCTCTCTC CCTCAAOATC 480 

ACCGTTTTOG CTTCTCCGGC AACCAOGAC CAGCTCOAAC TTGGCTTCAC TCAACACAAC 540 






WO 97/41152

PCT/US97/07022



CTCAA CACT TOGCCTCTGA GATCAACATC TCCCTTGACA TCCAAGTTTT GAGCTTAGAC 600

G7CCTCOOCT CGATCTCOTG OCCTAACTCG TOGGAGAAAG AAGCTGTOGC CGTTAACATC 660

TCCCCC6CGT CCTTCTCGCA CCTCCCTTTO GTCCTCOOTT TCGTGAAGCA TCTATCTCCG 720

ACX3ATCATCG TCTGCTCOGA CAGAOOATGC GAGAGGACOG ATCTGCCCTT CTCTGAACAG 780

CTOGCCCACT OGCTOGACTC ACACACCOCT CTCTTOGAAT CCCTCGACGC CGTCAAOGCC 840

AACCTOGAOO CAATCCAOAA GATCGAGAGG TTTCTTATAC AGCCGGAGAT ACAGAAGCTG 900

GTGTTOGATC GTAGCCOTCC GATAGAAAGG CCGATGATGA OGTGGCAAGC GATGTTTCTA 960

CAGATGGGTT TCTCAOCGOT GACGCACAGT AACTTCAOGG AGTCTCAAGC CGAGTGTTTA 1020

GTCCAAOGGA CGCCAGTGAO AGGCTTTCAC GTOGAGAAGA AACATAACTC ACTTCTCCTA 1080

TGTTGGCAAA GGACAGAACT CGTCGGAGTT TCAGCATGGA GATGTCGCTC CTCCTGATTT 1140

CCACC6GAGT TTCAATTATT AAAAAAATAT TTTCCTTAAT TCAATTTATC TTAAATGACA 1200

AATTTTTAGT TTCTGATTTT ATTTTGCTCA GTGCGATGGA TTTTTAAATT TAACTTTCAC 1260

ACAAATATAT AAATTTTTG 1279


(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 379 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Ala Ala Ile Phe Tyr Gly His His His His Thr Pro Pro Pro Ala Lys
1 5 10 15

Arg Leu Asn Pro Gly Pro Val Gly Ile Thr Glu Gln Leu Val Lys Ala
20 25 30

Ala Glu Val Ile Glu Ser Asp Thr Cys Leu Ala Gln Gly Ile Leu Ala
35 40 45

Arg Leu Asn Gln Gln Leu Ser Ser Pro Val Gly Lys Pro Leu Glu Arg
50 55 60

Ala Ala Phe Tyr Phe Lys Glu Ala Leu Asn Asn Leu Leu His Asn Val
65 70 75 80

Ser Gln Thr Leu Asn Pro Tyr Ser Leu Ile Phe Lys Ile Ala Ala Tyr
85 90 95

Lys Ser Phe Ser Glu Ile Ser Pro Val Leu Gln Phe Ala Asn Phe Thr
100 105 110

Ser Asn Gln Ala Leu Leu Glu Ser Phe His Gly Phe His Arg Leu His
115 120 125

Ile Ile Asp Phe Asp Ile Gly Tyr Gly Gly Gln Trp Ala Ser Leu Met
130 135 140

Gln Glu Leu Val Leu Arg Asp Asn Ala Ala Pro Leu Ser Leu Lys Ile
145 150 155 160

Thr Val Phe Ala Ser Pro Ala Asn His Asp Gln Leu Glu Leu Gly Phe
165 170 175

Thr Gln Asp Asn Leu Lys His Phe Ala Ser Glu Ile Asn Ile Ser Leu
180 185 190

Asp Ile Gln Val Leu Ser Leu Asp Leu Leu Gly Ser Ile Ser Trp Pro
195 200 205

Asn Ser Ser Glu Lys Glu Ala Val Ala Val Asn Ile Ser Ala Ala Ser
210 215 220

Phe Ser His Leu Pro Leu Val Leu Arg Phe Val Lys His Leu Ser Pro
225 230 235 240

Thr Ile Ile Val Cys Ser Asp Arg Gly Cys Glu Arg Thr Asp Leu Pro
245 250 255

Phe Ser Gln Gln Leu Ala His Ser Leu His Ser His Thr Ala Leu Phe
260 265 270

Glu Ser Leu Asp Ala Val Asn Ala Asn Leu Asp Ala Met Gln Lys Ile
275 280 285

Glu Arg Phe Leu Ile Gln Pro Glu Ile Glu Lys Leu Val Leu Asp Arg
290 295 300

Ser Arg Pro Ile Glu Arg Pro Met Met Thr Trp Gln Ala Met Phe Leu
305 310 315 320

Gln Met Gly Phe Ser Pro Val Thr His Ser Asn Phe Thr Glu Ser Gln
325 330 335

Ala Glu Cys Leu Val Gln Arg Thr Pro Val Arg Gly Phe His Val Glu
340 345 350

Lys Lys His Asn Ser Leu Leu Leu Cys Trp Gln Arg Thr Glu Leu Val
355 360 365

Gly Val Ser Ala Trp Arg Cys Arg Ser Ser Xaa
370 375

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 745 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

TGCATACAAC GCACCGTTTT TCGTAACACG GTTTCGCGAA GCTCTATTTC ATTTCTCCTC 60 

GATTTTTGAC ATGCTTGAGA CAATTOTGCC ACGAGAAGAC GAAGAGAGGA TGTTCCTTGA 120 

GATGGAOGTC TTTGGGAGAG AGGCACTGAA TOTGATTGCT TGCGAAGGTT GGGAAAGAGT 180 

GGAGAGGCCT GAGACATACA AGCAGTGGCA CGTACGGGCT ATGAGCTCAG GGTTGGTGCA 240

GGTTCGATTT OACCGAAGCA TTATOAAGAC ATCGCTOCAT AAGOTCCACA CATTCTACCA 300

CAAGGATTTT GTGATCGATC AAGATAACCG GTGGCTCTTO CAAGCCTG A AGGGAAGAAC 360 

TGTCATGCCT CAiTCltil"ir GGAAACCAGA OTCCAAGGCT TGACOGAGAA ATOCTCGTTG 420 

GCATATGAGA GACCATCTCT TG A TTTTCTT CCTGTGTAAT TCCCAGAGAC AOAATTACAG 480 

ATGTAAGAAG AGAATGCTGC ACAAAOAACT TOTTGAAAGA TAATATTGAT GTAAGTCCTG 540 

TTTTATAACT TTCTAGCTCT GTTTTTGTTC TTTCTCAGCT AGATTCTCCT AAOGGTATTC 600 

TTGTAGCTAG GOTGATCAGA TTGTTTGTAT ATTGCTAOCA GAGTTAGTTT GTCTAGATTG 660 

TAACACATAT AAGAGGAAGC TTAOAOTTTC TATGOTTTAA AGAGAAGTTT TTTCCTTCTC 720 

CAATGTAAAA AAAAAAAAAA AAAAA 745 

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 134 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

Ala Tyr Asn Ala Pro Phe Phe Val Thr Arg Phe Arg Glu Ala Leu Phe
1 5 10 15

His Phe Ser Ser Ile Phe Asp Met Leu Glu Thr Ile Val Pro Arg Glu
20 25 30

Asp Glu Glu Arg Met Phe Leu Glu Met Glu Val Phe Gly Arg Glu Ala
35 40 45

Leu Asn Val Ile Ala Cys Glu Gly Trp Glu Arg Val Glu Arg Pro Glu
50 55 60

Thr Tyr Lys Gln Trp His Val Arg Ala Met Arg Ser Gly Leu Val Gln
65 70 75 80

Val Pro Phe Asp Pro Ser Ile Met Lys Thr Ser Leu His Lys Val His
85 90 95

Thr Phe Tyr His Lys Asp Phe Val Ile Asp Gln Asp Asn Arg Trp Leu
100 105 110

Leu Gln Gly Trp Lys Gly Arg Thr Val Met Ala Leu Ser Val Trp Lys
115 120 125

Pro Glu Ser Lys Ala Xaa
130

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 775 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

AAAAAATGGG AAACCATCAC TCTTGATGAA CTTATGATCA ATCCAGGAGA GACAACGGTC 60 

GTCAACTOCA TTCATCCOTT ACAATACACT CCTOATGAAA CTGTCTCATT AOACTCTCCA 120 

AGAGACACOG TTCTGAA6CT ATTCAOAGAT ATCAATCCTG ACCTCTTTGT OTTTGCAOAO 180

ATTAACGGAA T6TACAACTC TCCTTTCTTC ATGACGAGGT TCCGAGAAOC G C TTTTTCAT 240 

TACTCTTCAC TCTTTGACAT GTTTGACACC ACAATACACG CAGAGGATGA GTACAAAAAC 300 

AGGTCACTGT TGGAGAGAOA GTTACTTGTG AGAGAOGOGA TGAGCGTGAT TTCCTGCGAG 360 

GGTGCAGAGC GGTTTGCGAG GCCTGAAACC TACAAGCAAT GGCGAGTTAG GATTTTGACA 420 

GCCGGGTTTA AGCCAGCAAC TATTAGCAAA CAGATCATGA AGGAGGCTAA GGAAATTGTG 480 

AOGAAACGTT ACCATAGAGA TTTTGTGATC GATAGCGATA ACAATTGGAT GCTTCAAGGA 540 

TGGAAAGGAA GAGTCATCTA TGCTTTTTCT TGCTGGAAAC CTGCTGAGAA GTTCACAAAC 600 

AATAATTTAA ACATCTGAAA AATGTTACTT CTCAATTACA TCATTTTTGT TTCCCAATGG 660 

TTTTGTAGAA TATGTTTGAT CCCGTGAGTG GATGCAACTC TTTTTTCCTG CAAGTACATA 720 

TTOTATTCAA ATCCTTGTGG AAATGATAAA TTGTTTAATC AAAAAAAAAA AAAAA 775

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 206 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

Lys Lys Trp Glu Thr Ile Thr Leu Asp Glu Leu Met Ile Asn Pro Gly
1 5 10 15

Glu Thr Thr Val Val Asn Cys Ile His Arg Leu Gln Tyr Thr Pro Asp
20 25 30

Glu Thr Val Ser Leu Asp Ser Pro Arg Asp Thr Val Leu Lys Leu Phe
35 40 45

Arg Asp Ile Asn Pro Asp Leu Phe Val Phe Ala Glu Ile Asn Gly Met
50 55 60

Tyr Asn Ser Pro Phe Phe Met Thr Arg Phe Arg Glu Ala Leu Phe His
65 70 75 80

Tyr Ser Ser Leu Phe Asp Met Phe Asp Thr Thr Ile His Ala Glu Asp
85 90 95

Glu Tyr Lys Asn Arg Ser Leu Leu Glu Arg Glu Leu Leu Val Arg Asp
100 105 110

Ala Met Ser Val Ile Ser Cys Glu Gly Ala Glu Arg Phe Ala Arg Pro
115 120 125

Glu Thr Tyr Lys Gln Trp Arg Val Arg Ile Leu Arg Ala Gly Phe Lys
130 135 140

Pro Ala Thr Ile Ser Lys Gln Ile Met Lys Glu Ala Lys Glu Ile Val
145 150 155 160

Arg Lys Arg Tyr His Arg Asp Phe Val Ile Asp Ser Asp Asn Asn Trp
165 170 175

Met Leu Gln Gly Trp Lys Gly Arg Val Ile Tyr Ala Phe Ser Cys Trp
180 185 190

Lys Pro Ala Glu Lys Phe Thr Asn Asn Asn Leu Asn Ile Xaa
195 200 205

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 548 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

AATCGCTTGA ACCGAATTTG GATCCAGATT CGAAAGAAAG GCTOAGAGTG GAGAGAGTGC 60 

TGTTCGGTAG GAGGATTATG GATTTGGTCC GATCAGATGA TGATAATAAT AAACC6GGAA 120 

CCCGGTTTGG GTTAATGGAG GAGAAAGAAC AATGGACAGT GTTGATGGAG AAAGCTGGAT 180 

TTGAGCCGCT TAAACCGAGT AATTACGCCG TTAGCCAAGC GAAGCTGCTA CTATGGAACT 240 

ACAATTATAG TACATTGTAT TCACTTGTTC AATCGGAGCC AGGTTTCATC TCCTTCGCTT 300 

GGAACAATGT GCCTCTCCTC ACCGTTTCCT CTTCGOGTTG ACTACTTGGT CCGATAAGTT 360 

AATCTAGTAT TTTGAGTTAG CTTTTAGAAT TGAATTGTTT GGGGTTAOAT TTGGATCTTT 420 

AATTAGTCTC TAGCCTATTC TCTTACTCTT TTTTGTCTAG TGCTTGGAGT GATGATGGTT 480 

TGTCGTTTAT GTTCATTTGT AATATATATT GTATGTAACA TTTCACTAAA AAAAAAAAAA 540 
AAAAAAAA 548

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 113 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Ser Leu Glu Pro Asn Leu Asp Arg Asp Ser Lys Glu Arg Leu Arg Val
1 5 10 15

Glu Arg Val Leu Phe Gly Arg Arg Ile Met Asp Leu Val Arg Ser Asp
20 25 30

Asp Asp Asn Asn Lys Pro Gly Thr Arg Phe Gly Leu Met Glu Glu Lys
35 40 45

Glu Gln Trp Arg Val Leu Met Glu Lys Ala Gly Phe Glu Pro Val Lys
50 55 60

Pro Ser Asn Tyr Ala Val Ser Gln Ala Lys Leu Leu Leu Trp Asn Tyr
65 70 75 80

Asn Tyr Ser Thr Leu Tyr Ser Leu Val Glu Ser Glu Pro Gly Phe Ile
85 90 95

Ser Leu Ala Trp Asn Asn Val Pro Leu Leu Thr Val Ser Ser Trp Arg
100 105 110

Xaa



(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1093 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:



GCGAATGTTO AGATCTTGGA AGCAATAGCT GGGGAAACCA GAGTCCACAT TATCGATTTT 60

CAGATTGCAC AGGGATCACA ATACATGTTT TTGATTCACG AGCTTGCGAA ACGCCCTGGT 120

GGGCCX;CC6T TGCTGCGTGT GACGGGTGTG GATGATTCAC AGTCCACCTA TGCTCXSTGGG 180

GGA6GACTCA GCTTGGTAGG TGAGAGGCTT 6CAACTTTGG CGCAGTCATG TGGTGTCCCG 240

TTTGAGTTTC ACGATGCCAT CATGTCTGGG TGCAAGGTGC AGCGGGAACA TCTCGGGTTG 300

GAACCTQGCT TTGCTGTTOT TGTGAACTTG CCATATGTAT TACACCACAT GCCAGAOGAG 360

AGCGTAAGTG TTGAAAAATA CAGAGACACG CTCCTGCATC TGATCAAGAG CCTCTCCCCA 420

AAACTOOTTA CTCTAGTAGA GCAAGAATCC AACACAAACA CCTCGCCATT GGTGTCAC6G 480

TTTGTGGAAA CACTGGATTA CTACACAGCG ATOTTTGAGT CGATAGATGC AGCACGGCCA 540

CGG6ATGATA AGCAQAGAAT CAGOGGAGAA CAACACTGTG TAGCAAGAGA CATAGTGAAC 600

ATGATAGCAT OTOAOGAOTC AGAGAOA6TA OAGAGACAOG AGCTACTCGG GAAATGGAGG 660

GTCAGAATGA TGATGOCTGG GTTCACOGGT TGGCCOGTGA GCACATCTGC AGOOTTTGCA 720

CGA6TGAGA TGCTGAAAGC TTATGACAAA AACTACAAAC TGGGAGGCCA TOAAGGAGOO 780

CTCTACCTCT TCTCGAXGA6 ACGACCCATC GCTAGATGTT CCGTOTOOAA GCCAAACCCA 840

AACTATATT6 GCTAAGTTAT AOTGATGATO TTACTTGAO TOOATAAAGA AGAGCACAAC 900

AAAAAGAGAT CTOTOGCTOT AAATTTTTTA OGATGTGCAA TCATCTTTTA AGTTGTAACA 960

CAACCTAAOT TATATATOTA TACAAACCAA ACCTGGTGGT TTOTAAATTG 1020


TCATGTGOTT GTGGGTOOOA AGCTAGTAAT OAAATATAAC CAAAACATTG ATTAGGTCAA 1080 
AAAAAAAAAA AAA 1093 

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 285 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Ala Asn Val Glu Ile Leu Glu Ala Ile Ala Gly Glu Thr Arg Val His
1 5 10 15

Ile Ile Asp Phe Gln Ile Ala Gln Gly Ser Gln Tyr Met Phe Leu Ile
20 25 30

Gln Glu Leu Ala Lys Arg Pro Gly Gly Pro Pro Leu Leu Arg Val Thr
35 40 45

Gly Val Asp Asp Ser Gln Ser Thr Tyr Ala Arg Gly Gly Gly Leu Ser
50 55 60

Leu Val Gly Glu Arg Leu Ala Thr Leu Ala Gln Ser Cys Gly Val Pro
65 70 75 80

Phe Glu Phe His Asp Ala Ile Met Ser Gly Cys Lys Val Gln Arg Glu
85 90 95

His Leu Gly Leu Glu Pro Gly Phe Ala Val Val Val Asn Phe Pro Tyr
100 105 110

Val Leu His His Met Pro Asp Glu Ser Val Ser Val Glu Lys Tyr Arg
115 120 125

Asp Arg Leu Leu His Leu Ile Lys Ser Leu Ser Pro Lys Leu Val Thr
130 135 140

Leu Val Glu Gln Glu Ser Asn Thr Asn Thr Ser Pro Leu Val Ser Arg
145 150 155 160

Phe Val Glu Thr Leu Asp Tyr Tyr Thr Ala Met Phe Glu Ser Ile Asp
165 170 175

Ala Ala Arg Pro Arg Asp Asp Lys Gln Arg Ile Ser Ala Glu Gln His
180 185 190

Cys Val Ala Arg Asp Ile Val Asn Met Ile Ala Cys Glu Glu Ser Glu
195 200 205

Arg Val Glu Arg His Glu Val Leu Gly Lys Trp Arg Val Arg Met Met
210 215 220

Met Ala Gly Phe Thr Gly Trp Pro Val Ser Thr Ser Ala Ala Phe Ala
225 230 235 240

Ala Ser Glu Met Leu Lys Ala Tyr Asp Lys Asn Tyr Lys Leu Gly Gly
245 250 255

His Glu Gly Ala Leu Tyr Leu Phe Trp Lys Arg Arg Pro Met Ala Thr
260 265 270

Cys Ser Val Trp Lys Pro Asn Pro Asn Tyr Ile Gly Xaa
275 280 285

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1928 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

AAACACTTTA CCAGATTTTC AAGCGGCTCA GAACATCAAC AACAACAACA ACAACAACCG 60

TTTTATAGTC AAGCAGCTCT CAACCCTTTT CTTTCAAGGT CTGTGAAGCC TCGAAATTAT 120

CAGAATTTTC AATCTCCCTC GGCCGATGAT TCATCTCACG TCGGTGAATG ATATGAGTTT 180

GTTTCGTGGT TCTGCTTCAT CTCACOGTTA CGGTTTACOC GTTCCCAGGT CTCAGACGCA 240

ACACGAACAA TCGGATTACG GTTTATTTGG TGGGATCOGA ATGGGAATCG GGTCGGGTAT 300

TAATAATTAT CCAACATTAA CCGGCGTTCC GTGTATTCAA CX^GTTCAAA ACCGGGTTCA 360

TGAATCGGAG AACATCTTGA ATACTTTAAG AGAGCTTGAG AAACAGCTTT TAGATGATGA 420

CGATGAGAGT GGTGGTGATG ATGAOGTGTC AGTTATAACA AATTCAAATT CCGATTGGAT 480

TCAAAATCTC GTGACTCCGA ACCCGAACCC GAACCCGGTT TTCTCTTTTT CACCGAGCTC 540

TTCTTCTTCG TCTTCTTCGC CTTCTACAOC TTCCACGACO ACATCGGTAT CTTCTACGCA 600

AACOOTTATO GAAATCGCGA OGGCGATCGC GGAAGGGAAA ACAGAGATAG CGACGGAGAT 660

TTTOGCGOGT GTTTCTCAAA CGCCTAATCT TGAGAGGAAT TCAGAGGAGA AGCTTGTTGA 720

TTTCATGGTO GCTGCGCTTC GATCGAGGAT AGCTTCTCCA GTGACGGAAT TGTATGGGAA 780

GAGCATTTA ATCTOGACTC AATTGCTCTA OCAGCTCTCT CCTTGTTTCA AACTCOGTTT 840

BACGCCOCO AATCTCGCCA TTCTCGACGC CGCCGATAAC AACGACOGTC GAATGATGAT 900

ACCGCACGTT ATCGATTTC6 ATATCGGAGA AGGTGGACAA TACGTTAACC TTCTCCGTAC 960

ATTATCCACG OGCCGGAATG OTAAAAGTCA GAGTCAOAAT TCTCCGGTCC TTAAGATCAC 1020

C6CCGTGGCG AACAACGTTT ACGGATOTTT AGTCGATOAC GCTCCAGAAG A AGGTTAAA 1080

AGCCGTOGGA OATTTGTTGA GCCAACTCGG TGATCGACTC GGTATCTCCG TAAGTTTCAA 1140

OGTGGTOAOG AGTTTACOAC TCCGTCATCT AATC6T6AA TCTCTCGOGT GTGATCCCGA 1200

CGAGACTTT G GCTGTOAACT TAOCTTTCAA CCTTTATCGT GTTCCCGACG AAAGOGTATG 1260

CACGGAGAAT CCAAGAOAOG AACTTCTCOG CGOGTGAAG GGACTTAAAC CGCGCGTGGT 1320

TACTCTAGTG GAOCAAOAAA TGAATTOOAA TAOGGOOCOO TTTTTAOGGA GAGTGAGTGA 1380

GTGATGOGCG TGTTAOQGTG CGTTGCTTGA OTOOGTOGAG TCTACGOTTC CTAGTACGAA 1440

TTCCGACOGT OCCAAAGTTG AGGAAGGAAT TGGCCXKSAAG CTAGTAAACG CGGTGGCGTG 1500

CX2AAGGAATC GATOGTATAG AGCGGTGCGA GGTGTTOGGG AAATGGOGAA TGCGGATGAC 1560

CATGGCTGGG TTTGAGTTAA TGCCATTGAG TCAGAACATA GCGGAGTCGA TGAAGAGTCG 1620

TGGAAACC6A GTCCACCCGG GCTTTACCGT TAAACAAGAT AAOGGAGGTG TGTGCTTTGG 1680

TTGGATOGGA OGGCCACTCA CTGTCGCATC OGCTTGGCOT TAACTTCACA CACTCTTTTT 1740

TTTCTTCTTA TTATTACCAT ATTATTATTA ATTTTCCAGA TTATTCTGAT ATTATTATCA 1800

TTCTGATTTT CCGTTTCXSAA AAGTGTAGGA ATCTTATGTA ACAAAGAAAA AAAAAAGACT 1860

TTTATGTTTT TCTAATAATA AAAGAAAGAG TGATTGGCTT CAAAAAAAAA AAAAAAAAAA 1920

AAAAAAAA 1928

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 524 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Asp Leu Thr Ser Val Asn Asp Met Ser Leu Phe Gly Gly Ser Gly Ser
1 5 10 15

Ser Gln Arg Tyr Gly Leu Pro Val Pro Arg Ser Gln Thr Gln Gln Gln
20 25 30

Gln Ser Asp Tyr Gly Leu Phe Gly Gly Ile Arg Met Gly Ile Gly Ser
35 40 45

Gly Ile Asn Asn Tyr Pro Thr Leu Thr Gly Val Pro Cys Ile Glu Pro
50 55 60

Val Gln Asn Arg Val His Glu Ser Glu Asn Met Leu Asn Ser Leu Arg
65 70 75 80

Glu Leu Glu Lys Gln Leu Leu Asp Asp Asp Asp Glu Ser Gly Gly Asp
85 90 95

Asp Asp Val Ser Val Ile Thr Asn Ser Asn Ser Asp Trp Ile Gln Asn
100 105 110

Leu Val Thr Pro Asn Pro Asn Pro Asn Pro Val Leu Ser Phe Ser Pro
115 120 125

Ser Ser Ser Ser Ser Ser Ser Ser Pro Ser Thr Ala Ser Thr Thr Thr
130 135 140

Ser Val Cys Ser Arg Gln Thr Val Met Glu Ile Ala Thr Ala Ile Ala
145 150 155 160

Glu Gly Lys Thr Glu Ile Ala Thr Glu Ile Leu Ala Arg Val Ser Gln
165 170 175

Thr Pro Asn Leu Glu Arg Asn Ser Glu Glu Lys Leu Val Asp Phe Met
180 185 190

Val Ala Ala Leu Arg Ser Arg Ile Ala Ser Pro Val Thr Glu Leu Tyr
195 200 205

Gly Lys Glu His Leu Ile Ser Thr Gln Leu Leu Tyr Glu Leu Ser Pro
210 215 220

Cys Phe Lys Leu Gly Phe Glu Ala Ala Asn Leu Ala Ile Leu Asp Ala
225 230 235 240

Ala Asp Asn Asn Asp Gly Gly Met Met Ile Pro His Val Ile Asp Phe
245 250 255

Asp Ile Gly Glu Gly Gly Gln Tyr Val Asn Leu Leu Arg Thr Leu Ser
260 265 270

Thr Arg Arg Asn Gly Lys Ser Gln Ser Gln Asn Ser Pro Val Val Lys
275 280 285

Ile Thr Ala Val Ala Asn Asn Val Tyr Gly Cys Leu Val Asp Asp Gly
290 295 300

Gly Glu Glu Arg Leu Lys Ala Val Gly Asp Leu Leu Ser Gln Leu Gly
305 310 315 320

Asp Arg Leu Gly Ile Ser Val Ser Phe Asn Val Val Thr Ser Leu Arg
325 330 335

Leu Gly Asp Leu Asn Arg Glu Ser Leu Gly Cys Asp Pro Asp Glu Thr
340 345 350

Leu Ala Val Asn Leu Ala Phe Lys Leu Tyr Arg Val Pro Asp Glu Ser
355 360 365

Val Cys Thr Glu Asn Pro Arg Asp Glu Leu Leu Arg Arg Val Lys Gly
370 375 380

Leu Lys Pro Arg Val Val Thr Leu Val Glu Gln Glu Met Asn Ser Asn
385 390 395 400

Thr Ala Pro Phe Leu Gly Arg Val Ser Glu Ser Cys Ala Cys Tyr Gly
405 410 415

Ala Leu Leu Glu Ser Val Glu Ser Thr Val Pro Ser Thr Asn Ser Asp
420 425 430

Arg Ala Lys Val Glu Glu Gly Ile Gly Arg Lys Leu Val Asn Ala Val
435 440 445

Ala Cys Glu Gly Ile Asp Arg Ile Glu Arg Cys Glu Val Phe Gly Lys
450 455 460

Trp Arg Met Arg Met Ser Met Ala Gly Phe Glu Leu Met Pro Leu Ser
465 470 475 480

Glu Lys Ile Ala Glu Ser Met Lys Ser Arg Gly Asn Arg Val His Pro
485 490 495

Gly Phe Thr Val Lys Glu Asp Asn Gly Gly Val Cys Phe Gly Trp Met
500 505 510

Gly Arg Ala Leu Thr Val Ala Ser Ala Trp Arg Xaa
515 520

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2635 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

TCTTACTCAA GCTTCTTCTT TOTCATCTTG TTGCCGAATC CACAAAGAGG AGAATAAAGA 60 

TTOOACCTTT ATTAGATATT AACGACTCTG GATTTTTGCG TTTTTGGAGT TGGATCCACA 120 

TGGGTTCTTA TCOGGATGGA TTCCCTGGAT CCATGGAOGA GTTGGATTTC AATAAGGACT 180 

TTGATTTOCC TCCCTCCTCA AACCAAACCT TAGCTTTACC TAATGGGTTC TATTTAGATG 240 

ACTTAOATTT CTCATCCTTG GATCCTCCAG AGGCATATCC CTCCCAGAAC AACAACAACA 300 

ACAACATCAA CAACAAAGCT OTAGCAGGAG ATCTCTTATC ATCTTCATCT GATGACGCTG 360 

ATTTCTCTGA TTCTCTTTTO AAGTATATAA GCCAAGTTCT TATGGAAGAG GATATGGAAG 420 

A AAGCCTTG TATGTTTCAT OATGCTTTGG CTCTTCAAGC TGCTGAGAAA TCTCTCTATG 480 

AGGCTCTTGG TGAGAAAGAC CCTTCTTOGT CTTCTGCTTC TTCTGTGGAT CATCCTGAGA 540 

ATTGGCTAG TCATAGCCCT GACGGTTCTT GTTCACCTGC TGCTTTTAGT GATTACGCTA 600 

GCACCACTAC CACTACTTCC TCTGATTCTC ACTGGACTGT TGATCGTTTG GAGAATAGAC 660 

CTTCTTGCTT ACATACACCT ATGCOGAGTA ATTTTCTTTT CCACTCTACT TCTAGGTCCA 720 

ACAGTGTCAC OGGTGGTGGT GGTGGTGGTA ATAGTGOOCT TTACGGTTCA GGTTTTGGCG 780 

ATGATTTOGT TTCGAATATG TTTAAAGATC ATGAATTGCC TATGCACTTC AAGAAAGGGG 840 

TTGAGGAAGC TAGTAAGTTC CTTCCTAAGT CTTCTCAOCT CTTTATTGAT GTGGATAGTT 900 

ACATCCCTAT GAATTCTGGT TCCAACGAAA ATGGTTCTGA GGTTTTTGTT AAGACGGAGA 960 

AGAAAGATCA GACAGAGCAT CATCATCATC ATAGCTATGC ACCACCACCC AACAOATTAA 1020 

CTOGTAAOAA AAGCGATTGO OGCGAOGAAG ATGAAGATTT CGTTOAAGAA AGAAGTAACA 1080 

AOCAATCAGC TGTTTATGTT GAGGAAAGGG ACCTTTCTGA AATGTTTGAT AACATOTTCC 1140 

TATGTGGCCC TGGGAAACCT GTATGCATTC TTAACCAGAA CTTTCCTACA GAATCCCCTA 1200 

AAGTOGTGAC OOCACAGTCA AATGGAGCAA ACATTOCTGG GAAGAAATCA ACTTCTACTA 1260 

GTCATAGTAA CGATTCTAAG AAAGAAACTG CTOATTTGAG OACTCTTTTG CTGTTATOTG 1320

CACAAOCTGT ATCAGTOOAT ATOOTAOAA COGCCAAOOT TTAGCTAAGG CAOATACGAG 1380

AGCATTCTTC GCCTCTAGGC AATGGTTCAG AGOGGTTGGC TCATTATTTT GCAAATAOTC 1440 

TTOAAOCACX; CTTAGCTGGO ACOG6TACAC ACATCTACAC CGCTTTATCT TCGAAGAAAA 1500 

CGTCTOCAGC AGACATOTTO AAGOCTTACC AOACATACAT OTCGGTCTGC CCTTTGAAGA 1560 

AAGCTGCTAT CATATTTGCT AACCACAGCA TGATGOGTTT CACTGCAAAC GCGAACAOGA 1620 

TCCACATAAT AGATTTOGGA ATATCTTACG CTTTTCACTG CCCTCCTCTG ATTCATOCCC 1680 

TCTCGCTCAG CAGACCTGOT GQTTCGCCTA AGCTTGGAAT TACOGGTNNN NNNNNNNNNN 1740 

NNNNNNNNNN NNNNNNNNNN NNNGAGTTCA GGAGACAGGT CATOGCTTGG CTOGATACTG 1800

TCAGCGACAC AATCTTCOGT TTGAGTACAA CGCAATTGCT CAGAAATGGG GAAACGATCC 1860 

AAGTCGAAGA CTTAAAGCTT CGACAAGGAG AGTATGTOGT TGTGAACTCT TTGTTCCOTT 1920 

TCAGGAACCT TCTAOATOAG ACOGTTCTGG TAAACAGCCC GAGAGATGCA GTTTTGAACC 1980 

TGATAAGAAA AATAAACCOO AATGTCTTCA TTCCACCGAT CTTAACCCCG AATTACAACG 2040 

CGCCATTCTT TGTCAOGAGG TTCAGAGAAG CCTTCTTTCA TTACTCGCCT GTGTTTCATA 2100 

TCTCTCACTC 6AAGCTAGCT AGGGAAGACG AGATGAOGCT CATGTATCTG TTTGAGTTTT 2160 

ATCCCAGAGA GATTGTGAAT CTTCTGGCTT CTGAAGCAAC AGAGAGAGTG CACAGCCGAG 2220 

AGACATATAA GCAGTGGCAG GOGAGACTCA TCCGAGCCCG ATTTAGACAG CTTCOGCTTG 2280 

AGAAGGAACT GATGCAGAAT CTGAAGTTGA AAATCGAAAA CCGGTACGAT AAAAACTTCG 2340 

ATGTTGATCA AAACGGTAAC TGGTTACTTC AAGGGTGGAA AGGTAGAATC GTGTATGCTT 2400 

CATCTCTATG GGTTCCTTCG TCTTCATAGA TGTTCTTTCT TAOGTTCTAA GC»ACTGGGA 2460 

TTTATGTACG GCTTTTCTGT TGATAGTCTC TCGCCAACAC GACTGGATTA AGTTCAGAGT 2520 

TAGGGTTCTT GAACACTAGA ATGTTGTTAT ATTATGCTTC TGACATAGCG TGTGTAAGAG 2580 

TCTAGCCTAA GAGATATAGT ACTCATTGCA TGATCTTTTG CTATATGTTN CATGT 2635

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 809 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Leu Leu Lys Val Leu Leu Cys His Leu Val Ala Glu Ser Thr Lys Arg
1 5 10 15

Arg Ile Lys Ile Arg Pro Leu Leu Asp Ile Asn Asp Ser Gly Phe Leu
20 25 30

Gly Phe Trp Ser Trp Ile His Met Gly Ser Tyr Pro Asp Gly Phe Pro
35 40 45

Gly Ser Met Asp Glu Leu Asp Phe Asn Lys Asp Phe Asp Leu Pro Pro
50 55 60

Ser Ser Asn Gln Thr Leu Gly Leu Ala Asn Gly Phe Tyr Leu Asp Asp
65 70 75 80

Leu Asp Phe Ser Ser Leu Asp Pro Pro Glu Ala Tyr Pro Ser Gln Asn
85 90 95

Asn Asn Asn Asn Asn Ile Asn Asn Lys Ala Val Ala Gly Asp Leu Leu
100 105 110

Ser Ser Ser Ser Asp Asp Ala Asp Phe Ser Asp Ser Val Leu Lys Tyr
115 120 125

Ile Ser Gln Val Leu Met Glu Glu Asp Met Glu Glu Lys Pro Cys Met
130 135 140

Phe His Asp Ala Leu Ala Leu Gln Ala Ala Glu Lys Ser Leu Tyr Glu
145 150 155 160

Ala Leu Gly Glu Lys Asp Pro Ser Ser Ser Ser Ala Ser Ser Val Asp
165 170 175

His Pro Glu Arg Leu Ala Ser His Ser Pro Asp Gly Ser Cys Ser Gly
180 185 190

Gly Ala Phe Ser Asp Tyr Ala Ser Thr Thr Thr Thr Thr Ser Ser Asp
195 200 205

Ser His Trp Ser Val Asp Gly Leu Glu Asn Arg Pro Ser Trp Leu His
210 215 220

Thr Pro Met Pro Ser Asn Phe Val Phe Gln Ser Thr Ser Arg Ser Asn
225 230 235 240

Ser Val Thr Gly Gly Gly Gly Gly Gly Asn Ser Ala Val Tyr Gly Ser
245 250 255

Gly Phe Gly Asp Asp Leu Val Ser Asn Met Phe Lys Asp Asp Glu Leu
260 265 270

Ala Met Gln Phe Lys Lys Gly Val Glu Glu Ala Ser Lys Phe Leu Pro
275 280 285

Lys Ser Ser Gln Leu Phe Ile Asp Val Asp Ser Tyr Ile Pro Met Asn
290 295 300

Ser Gly Ser Lys Glu Asn Gly Ser Glu Val Phe Val Lys Thr Glu Lys
305 310 315 320

Lys Asp Glu Thr Glu His His His His His Ser Tyr Ala Pro Pro Pro
325 330 335

Asn Arg Leu Thr Gly Lys Lys Ser His Trp Arg Asp Glu Asp Glu Asp
340 345 350

Phe Val Glu Glu Arg Ser Asn Lys Gln Ser Ala Val Tyr Val Glu Glu
355 360 365

Ser Glu Leu Ser Glu Met Phe Asp Asn Met Phe Leu Cys Gly Pro Gly
370 375 380

Lys Pro Val Cys Ile Leu Asn Gln Asn Phe Pro Thr Glu Ser Ala Lys
385 390 395 400

Val Val Thr Ala Gln Ser Asn Gly Ala Lys Ile Arg Gly Lys Lys Ser
405 410 415

Thr Ser Thr Ser His Ser Asn Asp Ser Lys Lys Glu Thr Ala Asp Leu
420 425 430

Arg Thr Leu Leu Val Leu Cys Ala Gln Ala Val Ser Val Asp Asp Arg
435 440 445

Arg Thr Ala Asn Val Xaa Leu Arg Gln Ile Arg Glu His Ser Ser Pro
450 455 460

Leu Gly Asn Gly Ser Glu Arg Leu Ala His Tyr Phe Ala Asn Ser Leu
465 470 475 480

Glu Ala Arg Leu Ala Gly Thr Gly Thr Gln Ile Tyr Thr Ala Leu Ser
485 490 495

Ser Lys Lys Thr Ser Ala Ala Asp Met Leu Lys Ala Tyr Gln Thr Tyr
500 505 510

Met Ser Val Cys Pro Phe Lys Lys Ala Ala Ile Ile Phe Ala Asn His
515 520 525

Ser Met Met Arg Phe Thr Ala Asn Ala Asn Thr Ile His Ile Ile Asp
530 535 540

Phe Gly Ile Ser Tyr Gly Phe Gln Trp Pro Ala Leu Ile His Arg Leu
545 550 555 560

Ser Leu Ser Arg Pro Gly Gly Ser Pro Lys Leu Arg Ile Thr Gly Xaa
565 570 575

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Glu Phe Arg Arg Gln
580 585 590

Val Ile Ala Trp Leu Asp Thr Val Ser Asp Thr Met Phe Arg Leu Ser
595 600 605

Thr Thr Gln Leu Leu Arg Asn Gly Glu Thr Ile Gln Val Glu Asp Leu
610 615 620

Lys Leu Arg Gln Gly Glu Tyr Val Val Val Asn Ser Leu Phe Arg Phe
625 630 635 640

Arg Asn Leu Leu Asp Glu Thr Val Leu Val Asn Ser Pro Arg Asp Ala
645 650 655

Val Leu Lys Leu Ile Arg Lys Ile Asn Pro Asn Val Phe Ile Pro Ala
660 665 670

Ile Leu Ser Gly Asn Tyr Asn Ala Pro Phe Phe Val Thr Arg Phe Arg
675 680 685

^"P Cys Asp Ser Lys 

690 695 700

Leu Ala Arg Glu Asp Glu Met Arg Leu Met Tyr Val Phe Glu Phe Tyr
705 710 715 720

Gly Arg Glu Ile Val Asn Val Val Ala Ser Glu Gly Thr Glu Arg Val
725 730 735

Glu Ser Arg Glu Thr Tyr Lys Gln Trp Gln Ala Arg Leu Ile Arg Ala
740 745 750

Gly Phe Arg Gln Leu Pro Leu Glu Lys Glu Leu Met Gln Asn Leu Lys
755 760 765

Leu Lys Ile Glu Asn Gly Tyr Asp Lys Asn Phe Asp Val Asp Gln Asn
770 775 780

Gly Asn Trp Leu Leu Gln Gly Trp Lys Gly Arg Ile Val Tyr Ala Ser
785 790 795 800

Ser Leu Trp Val Pro Ser Ser Ser Xaa
805

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 90 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

Gln Glu Ala Asp His Asn Lys Thr Gly Phe Leu Asp Arg Phe Thr Glu
1 5 10 15

Ala Leu Phe Tyr Tyr Ser Ala Val Phe Asp Ser Leu Asp Ala Ala Asn
20 25 30

Asn Asn Asn Asn Asn Asn Asn Gln Arg Met Glu Ala Glu Tyr Leu Gln
35 40 45

Arg Glu Ile Cys Asp Ile Val Cys Gly Glu Gly Ala Ala Arg Xaa Glu
50 55 60

Arg His Glu Pro Leu Ser Arg Trp Arg Asp Arg Leu Thr Arg Ala Gly
65 70 75 80

Leu Ser Ala Val Pro Leu Gly Ser Asn Ala
85 90

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 199 base paira 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Daucus carota



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 
TCTGCAGACA ATTTTNAGGA OGCCAATACC ATCCTATTGG AAATTTCAGA ACTGTCCACA 60

CCTNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNGTACTTC TCAOAGGNAA TOTCOGNNAG 120

ATTAGTTAGC TCCTGCTTAO GAATCTATCC TTCTCTTCCN 6CAACAGTGG TGCCTCCTCA 180

TGGTCAGAAA TCK^CTCA 199

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 66 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Daucus carota

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Ser Ala Asp Asn Phe Xaa Glu Ala Asn Thr Met Leu Leu Glu Ile Ser
1 5 10 15

Glu Leu Ser Thr Pro Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Tyr
20 25 30

Phe Ser Glu Xaa Met Ser Xaa Arg Leu Val Ser Ser Cys Leu Gly Ile
35 40 45

Tyr Ala Ser Leu Pro Ala Thr Val Val Pro Pro His Gly Gln Lys Val
50 55 60

Ala Ser



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 321 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Glycine max 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

TCAACTGAGA ATCTAOAAOA TGCCAACAAG ATGCTTCTGG AGATTTCTCA GTTATCAACA 60 

CCGTTCNNCA CTTCAGCACA GCCTGTOCCA GCATATTTCT CAGAAGCCAT ATCAOCAAGG 120 

TTGCTCAGTT CATGTCTAGG GATATACGCA ACTTTGCCAC ACACACACCA AAGCCACAAG 180 

GTAGCTTCAG CTTTTCAAGT GTTCAATGGT ATTAGTCCTT TAGTGCAGTT CTCACACTTC 240 

ACAGCAAACC AAGCAATTCA AOAAGCCTTC GAAAGAGAAG AGACGGTGCA CATCATAGAT 300 

CTTGATATAA TGCAAGGGTT G 321

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Glycine max

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Ser Thr Glu Asn Leu Glu Asp Ala Asn Lys Met Leu Leu Glu Ile Ser
1 5 10 15

Gln Leu Ser Thr Pro Phe Xaa Thr Ser Ala Gln Arg Val Ala Ala Tyr
20 25 30

Phe Ser Glu Ala Ile Ser Ala Arg Leu Val Ser Ser Cys Leu Gly Ile
35 40 45

Tyr Ala Thr Leu Pro His Thr His Gln Ser His Lys Val Ala Ser Ala
50 55 60

Phe Gln Val Phe Asn Gly Ile Ser Pro Leu Val Glu Phe Ser His Phe
65 70 75 80

Thr Ala Asn Gln Ala Ile Gln Glu Ala Phe Glu Arg Glu Glu Arg Val
85 90 95

His Ile Ile Asp Leu Asp Ile Met Gln Gly Leu
100 105

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 195 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Picea

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

TCTGCAGACA ACTTTOAAGA AGCCAATAGA ATACTGCCTC AOATCACAGA ACTCTCCACC 60 
CCCTATNGCA ACTCGOTOCA ACGAOTGGCT GCCTATNNNN NNNNNNNNNN NNNNNNNNNN 120
NNNNNNNNNN NNTGCATAGG AATGTATTCT CCTCTCCCTC CTATTCACAT GTCCCAGAOC 180 
CAGAAAATTG TQAAT 

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 65 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Picea

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

Ser Ala Asp Asn Phe Glu Glu Ala Asn Thr Ile Leu Pro Gln Ile Thr
1 5 10 15

Glu Leu Ser Thr Pro Tyr Xaa Asn Ser Val Gln Arg Val Ala Ala Tyr
20 25 30

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Ile Gly Met
35 40 45

Tyr Ser Pro Leu Pro Pro Ile His Met Ser Gln Ser Gln Lys Ile Val
50 55 60

Asn
65

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2151 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:



GATATCAGCA TCATCAATTT TAAATGTAAG TTGGGAAAAG ATCATGAGGG TTCTCATAGT 60

AATTTGGCCA CZAAGGTATGA CACTGTCTCA ATTGAGCAAT CTAGTAGAGA AACTGATCCA 120

TCATATATTC CTCATATTGA AAGTGAAAAA GATATGCTCA AGAACCTAGT AGAGAAGCTA 180

AAAATTGAAA AATCTAGCTC TACTAGAAAA ATATGATAGG TTGCCTGTTT CTCATGAAAA 240

TTTATTAGAT AATGATATCA TGGCTAGATG T06CTCATGA GCTTGTTCTT GCTAGTTTAG 300

ATTCCTOTGO GCATTCATCT CTTTTAOATG CACTAACATG ATAGGAAGTT TCTAATCTGG 360

TGCTTCACAA TTCTGGTGAT TGATGCTTCC TTCATTGCAA TTGATATTGA TGCTTGATTC 420

ATGCTTCAGT CACTTTGTGC GTTTAATTGG TATTGTATGT ATCACTAGAT TGTAGGGTCT 480

CTGCAACTAG TGTTTCAOCA TGTOOTTTTT TAGTATCATT OGTATTAGTT TCTAACTTTC 540

TATTGATATA TTAAAOTGAT AACTAGTTTT A6AAATATTC TCTTGTGCCA TTAATGCTAC 600

AACTTGTTTT TAGOOTGTAC OTTAOCATTA TAATATTTCC TTATTATGAA AGCOGAAOAG 660

AAAOGCGCCC AACCAGAOCA TCCACGTCOT CTCATTTCAC CTTCATCGTT GGATCATAGA 720

TGAGCGGTCC ACGGTGAACT C06TTTGCCT GCAAAACCAC GTCCTCTACG CGCTGTTAAG 780

TAGCTTCTAG AAACATGACG ATGTGTOCOO TCGATTCCTT TAGGAGGACC CCGATCCGGC 840

GCCGCAGTCO CCCAAGGTCC CGACOGCCGC GGCXTTCGGCC GCCGOOOCCA AGGAOCGGAA 900

GGAGGTOCAO OOGCGGAAOC AOCGCGAOGA GGAGOOCCTC CACCTGCTGA GTCCTQAOGC 960

TGCTOCTGGA GTGCGOGGAO CCGTGAACG OOGACAACCT CGACGACGCG CACCAGACGC 1020

TGCTGGAGAT CGCGOAG€?TG OCCACGOCX3T TCGGGAOCTC ACCCAGOGC GTGGOOGCCT 1080

ACTTOGCGOA GGCCATGTOO OOGCGCOTCO TCAGCTOCTO CCTAOGCCTO TACGCGCOGC 1140

TOCCCCOOGO CTCCCCCCCC OOGGOOOGCC TCCAOGOOOQ OOTOOCOOCC G06TTCCAGG 1200

TGTTCAACX5G CATCAOCCCC TTCGTCAAGT TCTOQCACTT GACeOOCAAC CAGGCCATCC 1260

AOOAGOOGTT COAGOGGGAO GAGOGTGTGC ACATGATCGA CCTOOACATC ATGCAOGOGC 1320

TGCAGTG6CC GGGCCTCTTC CACATCCTTG TCTCCOOCCC OOOCGOCOCG CCCAGGOTCA 1380

GGCTCAC06G CCTGOGGOCG TCCATOGACG COCTOGAOGC GAC60OOAAG OGCCTCTCOG 1440

ACTTOGCCGA CACGCTCGGC CTGCCCTTCG AGTTCTOCGC OOTCOCOGAG AAGGCCGGCA 1500

ACGTTGACCC GCAGAAGCTG GGCGTCACCC GGCGGGAGCC COTCGCCGTC GACTGGCOGC 1560

ACCACTOGCT TTACGACGTC ATCGGCTOOG ACTCCAACAC OCTCTGGCTC ATCCAAAGGT 1620


CCTCCATTTT 


CCTTCTCTCC 


CTTTCTTCCA 


TGTCAAATCT 


TOATGGAATC 


ATGACCACTT 


1680 


TTCAGCTGCT 


GACATTGGAT 


AATGTGAGCT 


TTACGGCAAG 


CATCAAGTGG 


TGGTAGTACA 


1740 


TCCATTACAO 


CTATTTCTAA 


AATATTCTTC 


GOAGGTTTCC 


TGCTCATAGT 


AAAAAAAAAT 


1600 


CGOGTTTTGA 


AGCTCAAAAG 


GCGATTTCTT 


COGAGGTTTG 


CTGTTGAGOG 


CTATTTTOOA 


1860 


AACCCCATTT 


TCTCAATTGA 


TTTTTATTTT 


TTAAAGAAAA 


ATTAGTTCAT 


TTTTCTCTTG 


1920 


TOAAATGGAG 


TCCCAAACTA 


ACCCTAATAT 


TAAAAAAAAC 


GCGCTTTGGA 


GCTCAAAACG 


1980 


CTCGTTGTTA 


TGACCAACCA 


GCTTTATAGG 


TTTAAAAAGG 


TTGAATCTTG 


ACAATGCTTT 


2040 


TGAAAAGGTT 


GAATCTTGAC 


AATGCTTTTG 


AGATGATACT 


GTAGTGTAGT 


CTGTAGTGGA 


2100 


GCATCCTCCA 


TGGTCTTTGG 


TGATCGAGAA 


TTCCTCCAGC 


CCGGGGGATC 


C 


2151 


(2) INFORMATION FOR SEQ ID NO:67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 716 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:

Tyr Gln His His Gln Phe Xaa Met Xaa Val Gly Lys Arg Ser Xaa Gly
1 5 10 15

Phe Ser Xaa Xaa Phe Gly His Lys Val Xaa His Cys Leu Asn Xaa Ala
20 25 30

Ile Xaa Xaa Arg Asn Xaa Ser Ile Ile Tyr Cys Ser Tyr Xaa Lys Xaa
35 40 45

Lys Arg Tyr Ala Gln Glu Pro Ser Arg Glu Ala Lys Asn Xaa Lys Ile
50 55 60

Xaa Leu Tyr Xaa Lys Asn Met Ile Gly Cys Leu Phe Leu Met Lys Ile
65 70 75 80

Tyr Xaa Ile Ile Ile Ser Trp Leu Asp Val Ala His Glu Val Val Leu
85 90 95

Ala Ser Leu Asp Ser Cys Gly His Ser Ser Leu Leu Asp Ala Leu Thr
100 105 110

Xaa Xaa Glu Val Ser Asn Leu Val Leu His Asn Ser Gly Asp Ser Cys
115 120 125

Phe Leu His Cys Asn Xaa Tyr Xaa Cys Leu Ile His Ala Ser Val Thr
130 135 140

Leu Cys Val Xaa Leu Val Leu Tyr Val Ser Leu Asp Cys Arg Val Ser
145 150 155 160

Ala Thr Ser Val Ser Pro Cys Gly Phe Leu Val Ser Phe Val Leu Val
165 170 175

Ser Asn Phe Leu Leu Ile Tyr Xaa Ser Asp Asn Xaa Phe Xaa Lys Tyr
180 185 190

Ser Leu Val Pro Leu Met Leu Gln Leu Val Phe Ser Val Tyr Val Ser
195 200 205

In^ ^y*^ Arg Olu Thr Arg Pro Thr 

210 215 220 

Arg Ala Ser Thr Ser Ser His Phe Thr Phe Ile Val Gly Ser Xaa Met
225 230 235 240

Ser Gly Pro Arg Xaa Thr Pro Phe Ala Cys Lys Thr Thr Ser Ser Thr
245 250 255

Arg Cys Xaa Val Ala Ser Arg Asn Ile Thr Met Cys Pro Val His Ser
260 265 270

Phe Arg Arg Ser Arg Ile Arg Arg Arg Ser Arg Pro Arg Ser Arg Pro
275 280 285

"^^^ ^'^^ A*"9 Arg Cys Ser Gly 

290 295 300

Gly Ser Ser Ala Thr Arg Arg Ala Ser Thr Cys Xaa Val Leu Thr Leu
305 310 315 320

Leu Leu Gln Cys Ala Glu Ala Val Asn Ala Asp Asn Leu Asp Asp Ala
325 330 335

His Gln Thr Leu Leu Glu Ile Ala Glu Leu Ala Thr Pro Phe Gly Thr
340 345 350

Ser Thr Gln Arg Val Ala Ala Tyr Phe Ala Glu Ala Met Ser Ala Arg
355 360 365

^""^ ^^"^ ^* ^1* P~ Pro Gly Ser 

370 375 380

Pro Ala Ala Ala Arg Leu His Gly Arg Val Ala Ala Ala Phe Gln Val
385 390 395 400

Phe Asn Gly Ile Ser Pro Phe Val Lys Phe Ser His Phe Thr Ala Asn
405 410 415

Gln Ala Ile Gln Glu Ala Phe Glu Arg Glu Glu Arg Val His Ile Ile
420 425 430

Asp Leu Asp Ile Met Gln Gly Leu Gln Trp Pro Gly Leu Phe His Ile
435 440 445

Leu Val Ser Arg Pro Gly Gly Pro Pro Arg Val Arg Leu Thr Gly Leu
450 455 460

Gly Ala Ser Met Asp Ala Leu Glu Ala Thr Gly Lys Arg Leu Ser Asp
465 470 475 480

Phe Ala Asp Thr Leu Gly Leu Pro Phe Glu Phe Cys Ala Val Ala Glu
485 490 495

Lys Ala Gly Asn Val Asp Pro Gln Lys Leu Gly Val Thr Arg Arg Glu
500 505 510

Ala Val Ala Val His Trp Pro His His Ser Leu Tyr Asp Val Ile Gly
515 520 525

Ser Asp Ser Asn Thr Leu Trp Leu Ile Gln Arg Ser Ser Ile Phe Leu
530 535 540

Leu Cys Leu Ser Ser Met Ser Asn Leu Asp Ala Ile Met Thr Thr Phe
545 550 555 560

Gln Leu Leu Thr Leu Asp Asn Val Ser Phe Thr Ala Ser Ile Lys Ser
565 570 575

Trp Xaa Tyr Ile His Tyr Ser Tyr Phe Xaa Asn Ile Leu Arg Arg Phe
580 585 590

Pro Ala His Ser Lys Lys Lys Ser Arg Phe Glu Ala Gln Lys Ala Ile
595 600 605

Ser Ser Glu Val Cys Cys Xaa Ala Leu Phe Trp Lys Pro His Phe Leu
610 615 620

Asn Xaa Phe Leu Phe Phe Lys Glu Lys Leu Val His Phe Ser Leu Val
625 630 635 640

Lys Trp Ser Pro Lys Leu Thr Leu Ile Leu Lys Lys Thr Arg Phe Gly
645 650 655

Ala Gln Asn Ala Arg Cys Tyr Asp Gln Pro Ala Leu Xaa Val Xaa Lys
660 665 670

Gly Xaa Ile Leu Thr Met Leu Leu Lys Arg Leu Asn Leu Asp Asn Ala
675 680 685

Phe Glu Met Ile Leu Xaa Cys Ser Leu Xaa Trp Ser Ile Leu His Gly
690 695 700

Leu Trp Xaa Ser Arg Ile Pro Ala Ala Arg Gly Ile
705 710 715

WHAT IS CLAIMED IS:

1. An isolated nucleic acid molecule comprising a
nucleotide sequence that encodes a SCARECROW protein
containing an amino acid sequence substantially similar to
the sequence of MOTIF III (VHIID) of Arabidopsis SCR protein
shown in FIGS. 13A-F.

2. An isolated nucleic acid molecule comprising a
nucleotide sequence that (a) encodes a scarecrow protein
having the amino acid sequence shown of any one of SEQ ID
NO:2, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:34,
SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:41, SEQ
ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:46, SEQ ID
NO:48, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID
NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:59, SEQ ID
NO:61, SEQ ID NO:63, SEQ ID NO:65 or SEQ ID NO:67; or (b) is
the complement of the nucleotide sequence of (a).

3. An isolated nucleic acid molecule comprising a
nucleotide sequence that hybridizes to the nucleic acid of
Claim 2 and encodes a naturally occurring SCR gene product.

4. A nucleic acid molecule comprising a nucleotide sequence
that (a) encodes a SCR protein lacking one to four of the
following motifs delineated in FIGS. 13A-F: MOTIF I, MOTIF
II, MOTIF III, MOTIF IV, MOTIF V, or MOTIF VI; or (b) is the
complement of the nucleotide sequence of (a).

5. A nucleic acid molecule comprising a nucleotide sequence
that (a) encodes a polypeptide corresponding to MOTIF I,
MOTIF II, MOTIF IV, MOTIF V or MOTIF VI of the SCARECROW
protein delineated in FIGS. 13A-F; or (b) is the complement
of the nucleotide sequence of (a).

6. The isolated nucleic acid molecule of Claim 1 comprising
the nucleic acid sequence of any one of SEQ ID NO:1, SEQ ID
NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:45, SEQ ID
NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID
NO:55, SEQ ID NO:57, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64
or SEQ ID NO:66.

7. A DNA vector containing the nucleotide sequence of Claim
1, 2, 3, 4, 5, or 6.

8. An expression vector containing the nucleotide sequence
of Claim 1, 2, 3, 4, 5, or 6 operatively associated with a
regulatory nucleotide sequence containing transcriptional and
translational regulatory information that controls expression
of the nucleotide sequence in a host cell.

9. A genetically engineered host cell containing the
nucleotide sequence of Claim 1, 2, 3, 4, 5, or 6.

10. A genetically engineered host cell containing the
nucleotide sequence of Claim 1, 2, 3, 4, 5, or 6 operatively
associated with a regulatory nucleotide sequence containing
transcriptional and translational regulatory information that
controls expression of the nucleotide sequence in a host
cell.

11. An isolated SCARECROW protein.

12. The protein of Claim 11 having the amino acid sequence
shown in FIG. 5E (SEQ ID NO:2).

13. A SCARECROW protein lacking one to four of the following
motifs delineated in FIGS. 13A-F: MOTIF I, MOTIF II, MOTIF
III, MOTIF IV, MOTIF V, or MOTIF VI.

14. A polypeptide corresponding to MOTIF I, MOTIF II, MOTIF
IV, MOTIF V or MOTIF VI of the SCARECROW protein as
delineated in FIGS. 13A-F.

15. An antibody that immunospecifically binds the protein or
polypeptide of Claim 11, 12, 13 or 14.

16. An anti-idiotypic antibody that mimics an epitope of the
SCARECROW protein.

17. A plant engineered to overexpress or underexpress the
SCARECROW protein, so that cell division is modified and root
development is altered.

18. A plant engineered to overexpress the SCARECROW protein,
so that cell division is increased in roots, resulting in
thicker root development.

19. A transgenic plant containing a transgene having the
nucleotide sequence of Claim 1, 2, 3, 4, 5, or 6.

20. A transgenic plant containing a transgene having the
nucleotide sequence of Claim 1, 2, 3, 4, 5, or 6 operatively
associated with a regulatory nucleotide sequence containing
transcriptional and translational regulatory information that
controls expression of the nucleotide sequence in a
transgenic plant cell.

21. The transgenic plant of Claim 19, in which the transgene
encodes an antisense molecule that suppresses expression of
endogenous SCARECROW gene product, so that cell division is
decreased in roots, resulting in thinner root development.

22. A genetically engineered plant in which the endogenous
SCARECROW gene is disrupted or inactivated so that cell
division is decreased in roots, resulting in thinner root
development.

23. A transgenic plant containing a transgene encoding a
gene of interest operatively associated with a SCARECROW
promoter, so that the gene of interest is expressed in roots.

24. The transgenic plant of Claim 23, in which the gene of
interest encodes a gene product that confers herbicide, salt,
pathogen, or insect resistance.

25. A transgenic plant containing a transgene encoding a
gene of interest operatively associated with a SCARECROW
promoter, so that the gene of interest is expressed in stems.

26. The transgenic plant of Claim 25, in which the gene of
interest encodes a gene product that increases starch, lignin
or cellulose biosynthesis.

27. A plant engineered to overexpress or underexpress the
SCARECROW protein so that the stem or hypocotyl gravitropism
is altered.

28. The plant of Claim 27, which is less susceptible to
lodging than a wild-type plant.

10 20 30 40 50

1234567890 1234567890 1234567890 1234567890 1234567890
GGCACGAGCC CAACGGGTCC TGAGCTTCTT ACTTATATGC ATATCTTGTA 50
GTSP TGP ELL TYMH FLY

TGAAGCCTGC CCTTATTTCA AATTCGGTTA TGAATCTGCT AATGGAGCTA 100
EAC PYFK FGY ESA NGAI

TAGCTGAAGC TGTGAAGAAC GAAAGTTTTG TGCACATTAT CGATTTCCAG 150
AEA VKN ESFV HII DFQ

ATTTCTCAAG GTGGTCAATG GGTGAGTTTG ATCCGTGCTC TTGGTGCTAG 200
ISQG GQW VSL IRAL GAR

ACCTGGTGGA CCTCCGAACG TTAGGATAAC GGGAATTGAT GATCCGAGAT 250
PGG PPNV RIT GID DPRS

CATCGTTTGC TCGTCAAGGA GGACTTGAGT TAGTTGGACA AAGACTTGGG 300
SFA RQG GLEL VGQ RLG

AAGCTAGCTG AAATGTGCGG TGTTCCGTTT GAGTTCCATG GAGCTGCTTT 350
KLAE MCG VPF EFHG AAL

ATGCTGCACG GAAGTCGAAA TCGAGAAGCT AGGAGTTAGA AATGGAGAAG 400
CCT EVEI EKL GVR NGEA

CGCTCGCGGT TAACTTCCCG CTTGTTCTTC ACCACATGCC TGATGAGAGT 450
LAV NFP LVLH HMP DES

GTAACTGTGG AGAATCACAG AGATAGATTG TTGAGATTGG TCAAACACTT 500
VTVE NHR DRL LRLV KHL

GTCACCAAAC GTTGTGACTC TGGTTGAGCA AGAAGCGAAT ACAAACACTG 550
SPN VVTL VEQ EAN TNTA

CGCCGTTTCT TCCCCGGTTT GTCGAGACAA TGAACCATTA CTTGGCAGTT 600
PFL PRF VETM NHY LAV

10 20 30 40 50

1234567890 1234567890 1234567890 1234567890 1234567890
TTCGAATCAA TAGATGTGAA ACTOGCTAGA GATCACAAGG AAAGGATCAA 650
FESI DVK LAR DHKE RIN

TGTTGAGCAG CATTGTTTGG CTAGAGAGGT TGTGAATCTT ATAGCTTGTG 700
VEQ HCLA REV VNL IACE

AAGGTGTTGA AAGAGAAGAG AGGCACGAGC CACTAGGGAA ATGGAGGTCT 750
GVE REE RHEP LGK WRS

CGGTTTCACA TGGCGGGATT TAAACCGTAT CCTTTGAGCT CGTATGTGAA 800
RFHM AGF KPY PLSS YVN

CGCAACAATC AAAGGATTGC TTGAGAGTTA TTCAGAGAAG TATACACTTG 850
ATI KGLL ESY SEK YTLE

AAGAAAGAGA TGGAGCATTG TATTTAGGAT GGAAGAATCA ACCTCTTATC 900
ERD GAL YLGW KNQ PLI

ACTTCTTGTG CTTGGAGGTA ACTAATAAAA ACCTTGTTCG GTTTCAGAAG 950
T S C A W R X

AGATTAGAAA CTTCTTTTAA ACTTTGCAGA ATCTGTTTGT AAAAGTAAAA 1000

CTCATGCATG ATCCGNAGGA ACAAGTTGTC AAATGTTGTA GTAGTAAGTG 1050

ATATGTTGAT GACCCAAAAA AAAAAAAAAA AAAAA 1085

10 20 30 40 50

1234567890 1234567890 1234567890 1234567890 1234567890
GCTATGGAAG GAGAGAAGAT GGTTCATGTG ATTGATCTCG ATGCTTCTGA 50
AMEG EKM VHV IDLD ASE

GCCAGCTCAA TGGCTTGCTT TGCTTCAAGG TTTTAACTCT AGGCCTGAAG 100
PAQ WLAL LQA FNS RPEG

GTCCACCTCA TTTGAGAATC ACTGGTGTTC ATCACCAGAA GGAAGTGCTT 150
PPH LRI TGVH HQK EVL

GAACAAATGG CTCATACACT CATTGAGGAA GCAGAGAAAC TCGATATCCC 200
EQMA HRL IEE AEKL DIP

GTTTCAGTTT AATCCCGTTG TGAGTAGGTT /«JACTGTTTA AATGTAGAAC 250
FQF NPVV SRL DCL NVEQ

AGTTGCGGGT TAAAACAGGA GAGGCCTTAG CCGTTAGCTC GGTTCTTCAA 300
LRV KTG EALA VSS VLQ

TTGCATACCT TCTTGGCCTC TGATGATGAT CTCATGAGAA AGAACTGCGC 350
LHTF LAS DDD LMRK NCA

TTTACGGTTT CAGAACAACC CTAGTGGAGT TGACTTGCAG AGAGTTCTAA 400
LRF QNNP SGV DLQ RVLM

TGATGACCCA TGGCTCTGCA GCTGAGGCAC GTGAGAATGA TATGAGTAAC 450
MSH GSA AEAR END MSN

AACAATGGGT ATAGCCCTAG CGGTGAGTCG GCCTCATCTT TGCCTTTACC 500
NNGY SPS GDS ASSL PLP

AAGTTCAGGA AGGACTGATA GCTTCCTCAA TGCTATTTGG GGTTTGTCTC 550
SSG RTDS FLN AIW GLSP

CAAAGGTCAT GGTGGTCACT GAGCAAGACT CAGACCACAA CGGCTCCACA 600
KVM VVT EQDS DHN GST

10 20 30 40 50

1234567890 1234567890 1234567890 1234567890 1234567890
CTAATGGAGA GGCTATTAGA ATCACTTTAC ACCTACGCAG CATTGTTTGA 650
LMER LLE SLY TYAA LFD

TTGCTTGGAA ACAAAAGTTC CAAGAACGTC TCAAGATAGG ATCAAAGTGG 700
CLE TKVP RTS QDR IKVE

AGAAGATGCT CTTCGGGGAG GAGATCAAGA ACATCATATC CTGCGACGGA 750
KML FGE EIKN IIS CEG

TTTGAGAGAA GAGAAAGACA CGAGAAGCTT GAGAAATGGA GCCAGAGGAT 800
FERR ERH EKL EKWS QRI

DGATTTGGCT GGTTTTGGGA ATGTTCCTCT TAGCTATTAT GCGATGTTGC 850
DLA GFGN VPL SYY AMLQ

AGGCTAGGAG ATTGCTTCAA GGGTGCGGTT TTGATGGGTA TAGAATCAAG 900
ARR LLQ GCGF DGY RIK

GAAGAGAGCG GGTGCGCAGT AATTTGCTGG CAAGATCGAC CTCTATACTC 950
EESG CAV ICW QDRP LYS

GGTATCAGCT TGGAGATGCA GGAAGTGAAT GATATATTAC AGTTTGTCTT 1000
VSA WRCR KX

CTATTTTGGT TATGAGCAGA GTCCCTTTCT TTTTTGTATA CATGGGGACA 1050

CAATCTTAGT TGTTTTGTGA TGGTGACTTT CTGTCTCTTT ATGCTATTTT 1100

GGCTTAAATG CTTCTACTGC CTCTGCATGT AAAGCCTTTG TGTGTTGGTT 1150

CAATTTGGTC TGGTGTGGGT GTAATACCAA ACCAAATCCA ATTTGAGCTG 1200

AAGATAACTA ATTTGATGAT CGGCTCGTGC C 1231

Zm-scl1

10 20 30 40 50

CCAGGACGCGTTCGAGCGGGAGGAGCGTGTGCACATCATCGACCTCGACA
QEAFEREERVHIIDLDI

60 70 80 90 100

TCATGCA(x»x:TGCAGTGGCCGGGCCTCTTCCACATCCTTGCCTCCCGC
MQGLQWPGLFHILASR

10 20 30 40 50

1234567890 1234567890 1234567890 1234567890 1234567890
CACGCGTCCG TCAAAGGATA CAACCATGTA CACATAATTG ACTTTTCCCT 50
HASV KGY NHV HIID FSL

GATGCAAGGT CTCCAGTGGC CGGCACTCAT GGATGTCTTC TCCGCCCGTG 100
MQG LQWP ALM DVF SARE

AGGGTGGGCC ACCAAAGCTC CGAATCACAG GCATTGGCCC GAACCCAATA 150
GGP PKL RITG IGP NPI

GGTGGCCGTG ACGAGCTCCA TGAAGTGGGA ATTCGCCTCG CCAAGTATGC 200
GGRD ELH EVG IRLA KYA

ACACTCGGTG GGTATCGACT TCACTTTCCA GGGAGTCTGT GTCGATCAAC 250
HSV GIDF TFQ GVC VDQL

TTGATAGGTT GTGCGACTGG ATGCTTCTCA AACCAATCAA AGGAGAGGCA 300
DRL CDW MLLK PIK GEA

GTTGCCATAA ACTCCATCCT ACAACTCCAT CGCCTCCTCG TTGACCCAGA 350
VAIN SIL QLH RLLV DPD

TGCAAACCCA GTGGTGCCCG CACCAATAGA TATCCTCCTC AAATTGGTCA 400
ANP VVPA PID ILL KLVI

TCAAGATAAA CCCCATGATC TTCACGGTGG TTGAGCATGA GGCAGATCAC 450
KIN PMI FTVV EHE ADH

AACAGACCAC CACTACTAGA GAGGTTCACT AATGCCCTCT TCCACTATGC 500
NRPP LLE RFT NALF HYA

GACCATGTTT GACTCTTTGG AGGCCATGCA TCGTTGTACC AGTGGTAGAG 550
TMF DSLE AMH RCT SGRD

ACATCACCGA CTCACTCACA GAGGTGTACC TTCGAGGTGA GATTTTTGAC 600
ITD SLT EVYL RGE IFD

10 20 30 40 50

1234567890 1234567890 1234567890 1234567890 1234567890
ATTGTCTGCG GCGAGGGCAG TGCACGCACC GAACGTCATG AGTTGTTTGG 650
IVCG EGS ART ERHE LFG

TCACTGGAGG GAGAGGCTCA CCTATGCTGG GCTAACTCAA GTGTGGTTCG 700
HWR ERLT YAG LTQ VWFD

ACCCCGATGA GGTTGACACG CTAAAAGACC AGTTGATCCA TGTGACATCC 750
PDE VDT LKDQ LIH VTS

TTATCTGGCT CTGGGTTCAA CATCCTAGTG TGTGATGGCA GCCTTGCACT 800
LSGS GFN ILV CDGS LAL

AGCGTGGCAT AATCGCCCGT TATATGTGGC AACAGCTTGG TGTGTGACAG 850
AWH NRPL YVA TAW CVTG

GAGGAAATGC TGCCAGTTCC ATGGTTGGCA ACATCTGTAA GGGTACAAAT 900
GNA ASS MVGN ICK GTN

GATAGTAGAA GAAAGGAAAA CCGTAATGGA CCCATGGAGT AGCAGGAAGA 950
DSRR KEN RNG PMEX

ATAACCATGT CATGAGCAAA TCGATCAAGT AATAAAATGC ACTGATGACA 1000

TGCATGGTGA TCTAAAGTTT TTTTGCGTGA ATGTGCAATG ACGAATTGTT 1050

CAATTTGAAT AACCTAATCA TGAGACTCAA AAAAAAAAAA AAA 1093

FIG. 11B2
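
The one-letter amino acid lines printed beneath the nucleotide rows of these sequence sheets can be checked by translating each row in its reading frame. The following is a minimal sketch, assuming the standard genetic code; the names `translate`, `CODON_TABLE` and `first_row` are illustrative only and are not part of the disclosure. It translates the first nucleotide row of the preceding sheet (the row beginning CACGCGTCCG) in frame 0.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate a nucleotide string into one-letter amino acid codes.

    Whitespace between 10-base blocks is ignored. Codons containing
    anything other than A, C, G or T (for example N) are reported as X.
    Stop codons are reported as '*'.
    """
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(frame, len(seq) - 2, 3):
        residues.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(residues)


# First row of the sheet, copied with its block spacing (assumed frame 0).
first_row = "CACGCGTCCG TCAAAGGATA CAACCATGTA CACATAATTG ACTTTTCCCT"
print(translate(first_row))
```

The sixteen complete codons of that row give HASVKGYNHVHIIDFS, which agrees with the printed line "HASV KGY NHV HIID FSL"; the final L on the printed line is completed by the first base of the next row. Passing `frame=1` or `frame=2` checks rows whose first codon begins on the previous row.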